Announcing AsciiDocLIVE

I am very proud to announce the alpha launch of AsciiDocLIVE (https://asciidoclive.com/), a free online AsciiDoc editor with instant live preview, syntax highlighting, and more!

The Idea

The idea for AsciiDocLIVE struck me one day while writing a blog post. Being a fan of AsciiDoc, I use AsciiDoc for this blog, but the repetitive cycle of edit-compile-preview is pretty tedious. I realized that there wasn’t a good AsciiDoc editor that supported live previews (like Dillinger or Markable for Markdown). So, I hacked up a quick prototype in my spare time, and AsciiDocLIVE was born!

Features

Features included in this alpha launch:

  • Instant live preview: type in the left pane, see rendered result in the right pane.

  • Smart error messages: errors generated by the AsciiDoc compiler are displayed and linked to the offending line in the source text.

  • Syntax highlighting

To be available soon:

  • Saving documents

  • Importing from and exporting to Dropbox / Google Drive

How X Window Managers Work, And How To Write One (Part II)

In Part I of this series, we examined the role of X window managers in a modern Linux/BSD desktop environment, and how they interact with the X server and applications. In Part II, we will dig into the dirty details and walk through the code of an example reparenting non-compositing window manager, basic_wm.

Introduction

Before we start with the code, let’s go over a couple of basic implementation choices such as language and API.

Language

You can write a window manager in Haskell, Python, Lisp, Go, Java, or any other language that has X bindings, i.e. a library for communicating with X servers.

I chose C++ for basic_wm, our example window manager, mainly because the C libraries for X11 are the best documented. In addition to books such as the Xlib Programming Manual, documentation can be found in the form of widely available man pages (e.g., try man XOpenDisplay at a terminal). Example usage and common patterns abound in the source code of many great window managers written in the past three decades.

We will use C++11 and C++14 features where convenient, so you will need a compatible compiler (GCC 4.9 or higher, or Clang 3.4 or higher) if you want to play with the example source code.

A Tale of Two X Libraries

There are two official C libraries for X: Xlib and XCB. Xlib, hailing from 1985, was the original X client library, and was the only official X client library until the introduction of XCB in 2001. The two libraries have very different philosophies: whereas Xlib tries to hide the X protocol behind a friendly C API with lots of bells and whistles, XCB directly exposes the plumbing beneath.

In practice, this difference manifests itself most prominently in how the two libraries handle the fundamental asynchronous nature of X’s client-server architecture. Xlib attempts to hide the asynchronous X protocol behind a mixed synchronous and asynchronous API, whereas XCB exposes a fully asynchronous API.

For example, to look up the attributes (e.g., size and position) of a window, you would write the following code using Xlib:

XWindowAttributes attrs;
XGetWindowAttributes(display, window, &attrs);
// Do stuff.

Under the hood, XGetWindowAttributes() sends a request to the X server and blocks until it receives a response; in other words, it is synchronous. On the other hand, using XCB, you would write this instead:

xcb_get_window_attributes_cookie_t cookie =
    xcb_get_window_attributes(
        connection, window);
// Do other stuff while waiting for reply.
xcb_get_window_attributes_reply_t* reply =
    xcb_get_window_attributes_reply(
        connection, cookie, nullptr);
// Do stuff.
free(reply);

The function xcb_get_window_attributes merely sends the request to the X server, and returns immediately without waiting for the reply; in other words, it is asynchronous. The client program must subsequently call xcb_get_window_attributes_reply to block on the response.

The advantage of the asynchronous approach is obvious if we consider an example where we need to retrieve the attributes of, say, 5 windows at once. Using XCB, we can immediately fire off all 5 requests to the X server, and then wait for all of them to return. With Xlib, we have to send one request at a time and wait for its response to come back before we can send the next request. Therefore, we’d expect to only block for the duration of one round-trip to the X server using XCB, compared to 5 with Xlib.

The downside of XCB’s fully asynchronous approach is verbosity and a less programmer-friendly interface. The Xlib code above looks like your average C library call; the XCB code above is significantly more involved.

However, it is important to note that Xlib isn’t fully synchronous. Rather, Xlib has a mixture of synchronous and asynchronous APIs. In general, functions that do not return values (e.g., XResizeWindow, which changes the size of a window) are asynchronous, while functions that return values (e.g., XGetGeometry, which returns the size and position of a window) are synchronous:

Xlib saves up requests instead of sending them to the server immediately, so that the client program can continue running instead of waiting to gain access to the network after every Xlib call. This is possible because most Xlib calls do not require immediate action by the server. This grouping of requests by the client before sending them over the network also increases the performance of most networks, because it makes the network transactions longer and less numerous, reducing the total overhead involved.

Xlib sends the buffer full of requests to the server under three conditions.

The most common is when an application calls an Xlib routine to wait for an event but no matching event is currently available on Xlib’s queue. Since, in this case, the application must wait for an appropriate event anyway, it makes sense to flush the request buffer.

Second, Xlib calls that get information from the server require a reply before the program can continue, and therefore, the request buffer is sent and all the requests acted on before the information is returned.

Third, the client would like to be able to flush the request buffer manually in situations where no user events and no calls to query the server are expected. One good example of this third case is an animated game, where the display changes even when there is no user input.

— Xlib Programming Manual §2.1.2

This is the most confusing aspect of Xlib, and a source of endless frustration for those new to X programming. One of the major motivations for the creation of XCB was to eliminate this complexity.

Many popular window managers have already been ported to XCB from Xlib for the performance benefits. If you are interested, you can read up on how the Awesome and KWin window managers were ported to XCB.

I chose to use Xlib for basic_wm, however, because as a pedagogical example, readability and simplicity are much more important than performance. In fact, I would recommend starting with Xlib for any project and worrying about porting to XCB later, as Xlib is much easier to learn and prototype with.

While an in-depth discussion of the merits of Xlib and XCB is beyond the scope of this discussion, I do recommend you check out the official article on Xlib vs. XCB as it presents a fascinating case study of API design.

Dependencies and Building

Firstly, you will need Xlib development headers in order to compile against Xlib. They are available on Debian/Ubuntu as libx11-dev, on Fedora as libX11-devel, and on Arch Linux as part of libx11.

The only additional library used by the example basic_wm code is google-glog, Google’s open source C++ logging library. It is available on Debian/Ubuntu as libgoogle-glog-dev, on Fedora as glog-devel, and on Arch Linux as google-glog.

The recommended way to build the source code is with GNU Make: just run make in the source directory. Alternatively, g++ *.cpp will also do the trick if you supply all the libraries correctly.

To test the window manager, you will likely need Xephyr along with a couple of simple X programs such as xeyes or xterm.

Step 1: Setup and Teardown

Let’s start off with a skeleton implementation of the WindowManager class, which will encapsulate all the window management logic in our example. All it will do for now is set up a connection to the X server on construction, and close that connection on destruction.

extern "C" {
#include <X11/Xlib.h>
}
#include <memory>

class WindowManager {
 public:
  // Factory method for establishing a connection to an X server and creating a
  // WindowManager instance.
  static ::std::unique_ptr<WindowManager> Create();
  // Disconnects from the X server.
  ~WindowManager();
  // The entry point to this class. Enters the main event loop.
  void Run();

 private:
  // Invoked internally by Create().
  WindowManager(Display* display);

  // Handle to the underlying Xlib Display struct.
  Display* display_;
  // Handle to root window.
  const Window root_;
};

The corresponding implementation in window_manager.cpp:

#include "window_manager.hpp"
#include <glog/logging.h>

using ::std::unique_ptr;

unique_ptr<WindowManager> WindowManager::Create() {
  // 1. Open X display.
  Display* display = XOpenDisplay(nullptr);
  if (display == nullptr) {
    LOG(ERROR) << "Failed to open X display " << XDisplayName(nullptr);
    return nullptr;
  }
  // 2. Construct WindowManager instance.
  return unique_ptr<WindowManager>(new WindowManager(display));
}

WindowManager::WindowManager(Display* display)
    : display_(CHECK_NOTNULL(display)),
      root_(DefaultRootWindow(display_)) {
}

WindowManager::~WindowManager() {
  XCloseDisplay(display_);
}

void WindowManager::Run() { /* TODO */ }

The main function in main.cpp:

#include <cstdlib>
#include <glog/logging.h>
#include "window_manager.hpp"

using ::std::unique_ptr;

int main(int argc, char** argv) {
  ::google::InitGoogleLogging(argv[0]);

  unique_ptr<WindowManager> window_manager(WindowManager::Create());
  if (!window_manager) {
    LOG(ERROR) << "Failed to initialize window manager.";
    return EXIT_FAILURE;
  }

  window_manager->Run();

  return EXIT_SUCCESS;
}

Even if you have never programmed Xlib before, this should not be hard to understand. WindowManager::Create() is a static factory method that sets up a connection to an X server via XOpenDisplay(); we will let XOpenDisplay() figure out which X server to connect to from the DISPLAY environment variable. The connection is represented by the opaque Display structure. We call XCloseDisplay() on the saved Display* in the destructor to close the connection.

The other function of note is DefaultRootWindow(), which returns the default root window for a given X server. Technically, an X server may have several root windows in some rare multihead setups, but let’s not worry about that here.

If you run this program now, it should connect to the X server, close the connection, and exit. Hooray!

Step 2: Initialization

Now, let’s dig into the mysterious Run() function above. We’ll start with the initialization steps required after opening an X server connection. In window_manager.hpp:

class WindowManager {
  ...
  // Xlib error handler. It must be static as its address is passed to Xlib.
  static int OnXError(Display* display, XErrorEvent* e);
  // Xlib error handler used to determine whether another window manager is
  // running. It is set as the error handler right before selecting substructure
  // redirection mask on the root window, so it is invoked if and only if
  // another window manager is running. It must be static as its address is
  // passed to Xlib.
  static int OnWMDetected(Display* display, XErrorEvent* e);
  // Whether an existing window manager has been detected. Set by OnWMDetected,
  // and hence must be static.
  static bool wm_detected_;
};

In window_manager.cpp:

void WindowManager::Run() {
  // 1. Initialization.
  //   a. Select events on root window. Use a special error handler so we can
  //   exit gracefully if another window manager is already running.
  wm_detected_ = false;
  XSetErrorHandler(&WindowManager::OnWMDetected);
  XSelectInput(
      display_,
      root_,
      SubstructureRedirectMask | SubstructureNotifyMask);
  XSync(display_, false);
  if (wm_detected_) {
    LOG(ERROR) << "Detected another window manager on display "
               << XDisplayString(display_);
    return;
  }
  //   b. Set error handler.
  XSetErrorHandler(&WindowManager::OnXError);

  // 2. Main event loop.
  ...
}

int WindowManager::OnWMDetected(Display* display, XErrorEvent* e) {
  // In the case of an already running window manager, the error code from
  // XSelectInput is BadAccess. We don't expect this handler to receive any
  // other errors.
  CHECK_EQ(static_cast<int>(e->error_code), BadAccess);
  // Set flag.
  wm_detected_ = true;
  // The return value is ignored.
  return 0;
}

int WindowManager::OnXError(Display* display, XErrorEvent* e) { /* Print e */ return 0; }

We first select substructure redirection and substructure notify events on the root window. This is discussed in more detail in the Substructure Redirection section in Part I; to recap, this allows the window manager to intercept requests from top level windows, and subscribe to events concerning the same. Only one X client can select substructure redirection on the root window at any given time; the second client to attempt to do so will get a BadAccess error.

Catching this error is somewhat tricky, however. XSelectInput, like all asynchronous Xlib functions, does not actually send a request to the X server, but instead only queues the request and returns. Hence, we have to explicitly flush the request queue with XSync (see our discussion above in A Tale of Two X Libraries). We set up a temporary error handler, OnWMDetected, to catch errors during this XSync invocation.

Next, we set up our regular error handler which will be invoked for any future errors. Our implementation, which logs the error and continues, will be an important debugging aid as we implement and test our window manager. I will not show it here for the sake of brevity; for reference, check it out in window_manager.cpp.

Step 3: The Event Loop

Now let’s add to the Run() method above the signature construct of every modern GUI program - the event loop. In window_manager.cpp:

void WindowManager::Run() {
  // 1. Initialization.
  ...

  // 2. Main event loop.
  for (;;) {
    // 1. Get next event.
    XEvent e;
    XNextEvent(display_, &e);
    LOG(INFO) << "Received event: " << ToString(e);

    // 2. Dispatch event.
    switch (e.type) {
      case CreateNotify:
        OnCreateNotify(e.xcreatewindow);
        break;
      case DestroyNotify:
        OnDestroyNotify(e.xdestroywindow);
        break;
      case ReparentNotify:
        OnReparentNotify(e.xreparent);
        break;
      ...
      // etc. etc.
      ...
      default:
        LOG(WARNING) << "Ignored event";
    }
  }
}

If you have done low-level GUI programming before, this should look very familiar. We sit in an event loop and repeatedly fetch the next event with XNextEvent() and dispatch it to the appropriate handlers.

The structure of the XEvent type is typical of a polymorphic C structure. Each type of event carries different attributes and corresponds to an event struct, such as XKeyEvent, XButtonEvent, and XConfigureEvent. The first field of each struct is always int type. The XEvent type is a C union of all the event structs plus int type:

typedef struct _XKeyEvent {
  int type;
  // Fields specific to XKeyEvent.
  ...
} XKeyEvent;

typedef struct _XButtonEvent {
  int type;
  // Fields specific to XButtonEvent.
  ...
} XButtonEvent;

// etc.
...

typedef union _XEvent {
  int type;
  XKeyEvent xkey;
  XButtonEvent xbutton;
  // etc.
  ...
} XEvent;

This way, the type is always available regardless of the type of event and requires no additional storage. The same pattern can be observed in GTK+/GLib, Python’s C API, and many other object-oriented C APIs.

In basic_wm, the event handlers follow the naming convention of OnFoo(), where Foo is the type of the event, so it should be straightforward to figure out who does what.

What’s Next

We now have a basic skeleton for our window manager, and we can start filling in the meat - the event handlers. The million-dollar question is, what events does a window manager handle, and what should it do with them?

In the next installment in this series, we’ll answer that question by diving into the complex ways window managers, clients and the user interact with each other via X events. In the meantime, you’re more than welcome to check out the code for basic_wm on GitHub.

DEBUG trap and PROMPT_COMMAND in Bash

Update 03/08/2016: A patch by Dan Stromberg adds a PS0 variable to Bash that greatly simplifies what’s described in this article. This patch will likely be merged into Bash 4.4. Please refer to his post for details.

The DEBUG trap

The DEBUG trap is an extremely handy feature of Bash. The idea is pretty straightforward: if you run

trap "echo Hello" DEBUG

then Bash will run echo Hello before it executes each subsequent command. For example:

~/Scratch $ ls
Hello
file1 file2
~/Scratch $ echo Bye
Hello
Bye

A caveat, however, is that the DEBUG trap is triggered once per simple command; if you have command lists or control structures, the trap will be triggered multiple times. For example, using the setup above:

~/Scratch $ echo 1 && echo 2; echo 3
Hello
1
Hello
2
Hello
3
~/Scratch $ if [ -e /etc/passwd ]; then echo "/etc/passwd exists"; fi
Hello
Hello
/etc/passwd exists

What if we only want to run a command once per composite command, like the preexec hook in zsh?

Enter PROMPT_COMMAND.

PROMPT_COMMAND

The idea behind PROMPT_COMMAND is also very simple: if you run

PROMPT_COMMAND="echo Bye"

then Bash will execute echo Bye before it prints each subsequent prompt (i.e., after it has finished executing the previous command line). For example, using the setup above:

~/Scratch $ echo 1; echo 2
Hello
1
Hello
2
Hello
Bye

Note that the DEBUG trap is triggered again for PROMPT_COMMAND, in addition to the user-supplied commands.

Combining the DEBUG trap and PROMPT_COMMAND

By combining the DEBUG trap and PROMPT_COMMAND, we can now hack Bash to run some code right before and right after executing a full command. For example, try adding this to your ~/.bashrc:

# This will run before any command is executed.
function PreCommand() {
  if [ -z "$AT_PROMPT" ]; then
    return
  fi
  unset AT_PROMPT

  # Do stuff.
  echo "Running PreCommand"
}
trap "PreCommand" DEBUG

# This will run after the execution of the previous full command line.  We don't
# want PostCommand to execute when first starting a Bash session (i.e., at
# the first prompt).
FIRST_PROMPT=1
function PostCommand() {
  AT_PROMPT=1

  if [ -n "$FIRST_PROMPT" ]; then
    unset FIRST_PROMPT
    return
  fi

  # Do stuff.
  echo "Running PostCommand"
}
PROMPT_COMMAND="PostCommand"

The result:

~/Scratch $ echo 1; echo 2 && echo 3
Running PreCommand
1
2
3
Running PostCommand

This gives rise to some neat applications, such as a command timer script I wrote that prints out the execution time of each command.

Please feel free to check it out on GitHub :)

Happy Bash hacking!

How X Window Managers Work, And How To Write One (Part I)

Window managers are one of the core components of the modern Linux/BSD desktop. It is not an exaggeration to say that they define to a large degree our day-to-day user experience, as they are responsible for deciding how individual windows look, move around, react to input, and organize themselves. Hence, almost 30 years since the first X window manager, we still argue over the merits of different window managers, and new window managers continue to reinvent how we interact with our digital world.

In this series of posts, I hope to demystify how window managers work, and how you might go about writing one yourself.

I will be quoting quite heavily from the seminal Xlib Programming Manual (3rd Ed, 1994) by Adrian Nye and published by O’Reilly. Despite its age, it remains amazingly relevant and is the best available introductory text to the internals of X, which has not changed over the past two decades as much as you’d think. Since you could buy the book plus shipping for less than the price of a cup of coffee, I strongly recommend it to anyone interested in learning more about X. In addition, its chapter 16 also covers the basics of window management.

The Role of an X Window Manager

Let’s start with an examination of the role of the window manager in a modern Linux/BSD desktop environment.

The Rights of X Window Managers

Unlike other windowing systems such as Microsoft Windows or Mac OS X, X does not dictate a window manager or how a window manager should behave. We have this decision to thank for the wild diversity of X window managers we see today.

X is somewhat unusual in that it does not mandate a particular type of window manager. Its developers have tried to make X itself as free of window management or user interface policy as possible.
— Xlib Programming Manual §1.2.3

In fact, it does not even require a window manager to be present at all:

Unlike citizens, the window manager has rights but not responsibilities. Programs must be prepared to cooperate with any type of window manager or with none at all […].
— Xlib Programming Manual §1.2.3

This is in stark contrast to the integrative approach of other GUI systems. On Mac OS X and Unity, for example, an application could not possibly function without the window manager, as the latter is responsible for rendering a part of the application’s interface (e.g., menus).

The Responsibilities of X Window Managers

As you probably already know, X operates in a server-client model. An X server controls one or more physical display devices as well as input devices (mouse, keyboard, etc.). An application that wants to interact with these devices assumes the role of an X client. An X server and its clients may run on the same computer, in which case they communicate via Unix domain sockets, or on different computers, in which case they communicate through TCP/IP.

A window manager is a regular X client. It doesn’t have any superuser privileges or keys to kernel backdoors; it is a normal user process that is allowed by the X server to call a set of special APIs. X ensures that no more than one window manager is running at any given point by denying a client access to these APIs if another client currently has access. The first client to attempt to access these APIs always succeeds.

A window manager communicates with the windows it manages through two X mechanisms: properties and events. We will discuss these in detail in later sections, but the takeaway is that the communication happens through the X server, not directly between the window manager and other applications.

This is illustrated by the following diagram:

[Figure: Role of a Window Manager]

How an X Window Manager Manages Windows

Let’s now dive into the details of how a window manager does its job.

The Window Hierarchy

When we think about modern GUIs, we usually use the term widgets or controls to refer to UI elements such as buttons, scrollbars, or text boxes, and the term windows to refer to a container for such widgets that has its own name and can be independently moved around, closed, resized, etc.

X, however, was designed to be as low-level as possible. The fundamental UI model that X provides, upon which UI frameworks such as GTK+ and Qt are built, is that of a hierarchy of rectangles. In X terminology, all top level windows and all UI elements within are windows. In other words, a window is any rectangular area that is a unit of user interaction and/or graphic display.

Windows are organized into a tree hierarchy. At the root of the hierarchy is the root window, a virtual, invisible window that has the same size as the screen, and is always present. Top level windows are direct children of the root window. UI elements within a top level window are descendants of that window.

[Figure: An example X window]

For example, consider the dialog box above from the Xfce desktop environment. The entire dialog is an X window. All UI elements in the dialog box - the magnifying glass icon, the text box, the green down arrow, the Close and Launch buttons, and the icons inside those buttons - are also X windows.

The whole dialog window is a child of the root window. The magnifying glass icon, the text box, and the Close and Launch buttons are children of the dialog window. The green down arrow is a child of the text box window, and the icons in the Close and Launch buttons are children of those buttons respectively.

An important thing to note about X windows is that a child window is clipped to the boundaries of its parent:

A child may be positioned partially or completely outside its parent window, but output to the child is displayed and input received only in the area where the child overlaps with the parent.
— Xlib Programming Manual §2.2.2

For example, if we increase the width of the text box in the dialog above by 2x without changing the size of the dialog box, the portion of the text box that extends outside of the dialog box will become invisible, and clicking on it will not send an event to the text box.

A window manager manages top level windows - that is, direct children of the root window.

Substructure Redirection

In the absence of a window manager, when an application wants to do something with a window - move it, resize it, show/hide it, etc. - its request is directly processed by the X server, and that’s the end of that. A window manager, however, needs to intercept these requests. For example, a window manager may need to know that a new top level window has been created and displayed, in order to draw window decorations (e.g. minimize / maximize / close buttons) around it. It may also need to know that an existing top level window has been resized, in order to redraw the window decorations to reflect the change.

The mechanism that allows a window manager to intercept such requests is called substructure redirection.

This is how substructure redirection works. Suppose we have a window W. If a program M registers for substructure redirection on W, a matching request to modify any direct child window of W will not be executed by the X server. Instead, the X server redirects this request to the program M, which can do whatever it wants with the request, including denying the request outright or granting the request with modifications. More formally,

The structure, as the term is used here, is the location, size, stacking order, border width, and mapping status of a window. The substructure is all these statistics about the children of a particular window. This is the complete set of information about screen layout that the window manager might need in order to implement its policy. Redirection means that an event is sent to the client selecting redirection (usually the window manager), and the original structure−changing request is not executed.
— Xlib Programming Manual §16.2

Note that only direct children of a window W are affected by substructure redirection on W, not any windows further down the hierarchy.

This gets interesting when we consider substructure redirection on the root window:

When the window manager selects SubstructureRedirectMask on the root window, an attempt by any other client to change the configuration of any child of the root window will fail. Instead an event describing the layout change request will be sent to the window manager. The window manager then reads the event and determines whether to honor the request, modify it, or deny it completely. If it decides to honor the request, it calls the routine that the client called that triggered the event with the same arguments. If it decides to modify the request, it calls the same routine but with modified arguments.
— Xlib Programming Manual §16.2

In other words, a window manager must register for substructure redirection on the root window, which causes all creation, destruction, reconfiguration etc. of top level windows - which are direct children of the root window - to be routed to the window manager. This is the magic hook into the X server that window managers rely on to do their job.

This relationship is shown in the following diagram:

[Figure: Role of a Window Manager]

Finally, the X server only allows one running program to register for substructure redirection on any given window at any given time. An attempt to register for substructure redirection on a window will fail if another X client has already done the same on the same window, and has not unregistered, disconnected from the X server, or crashed. Since all window managers must register for substructure redirection on the root window, this acts as a locking mechanism that prevents two or more window managers from running simultaneously on the same screen.

Reparenting

In the example dialog box above, we see a title bar with, for example, little buttons for minimizing, maximizing, and closing the window. These UI elements are not created by the application, but by the window manager, via a process known as reparenting or framing:

A window manager can decorate [top level] windows on the screen with titlebars and place little boxes on the titlebar with which the window can be moved or resized. This is only one possibility […].

To do this, the window manager creates a child of the root somewhat larger than the top level window of the application. Then it calls XReparentWindow(), specifying the top level window of the application as win and the new parent [window it just created] as parent. win and all its descendants will then be descendants of parent.

— Xlib Programming Manual §16.3

In other words, if we were to run an X application without a window manager present, the top level window of the application would be a direct child of the root window. With a window manager running, however, the top level window of the application may be reparented by the window manager; it becomes a child of a frame window which is created by the window manager, and which is itself a direct child of the root window. The window manager can add other UI elements inside this frame window alongside the application’s top level window as it sees fit.

Therefore, I kind of lied to you several paragraphs ago: the dialog box shown earlier is really a child window within a frame window created by Xfce's window manager, Xfwm, along with other UI elements for window management.

Reparenting is what allows different window managers to draw different window decorations, and thereby achieve a consistent look-and-feel across windows. However, there are also window managers that do not reparent at all: these are called non-reparenting window managers. There are two reasons why a window manager would not want to reparent:

  1. If a window manager does not draw window decorations around top level windows, it obviously has no need to reparent them. Examples: xmonad, dwm.

  2. Compositing window managers do not always need to reparent windows; we will discuss why below. Example: Compiz. This is not true for all compositing window managers, however; for example, GNOME’s default window manager, Mutter, is a reparenting compositing window manager.

Let’s now consider substructure redirection in the context of reparenting. When a top level window W is first shown (mapped in X jargon), the window manager is notified because it has registered for substructure redirection on the root window, and a top level window is a direct child of the root window. It then creates a frame F and reparents W, so that W becomes a child of F, which itself is a child of the root window. But since now W is no longer a direct child of the root window, the window manager will no longer be able to intercept changes to W!

Therefore, a reparenting window manager must also subsequently register for substructure redirection on each frame window it creates.
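
To make the argument above concrete, here is a toy model in Python. It is only a sketch: the `Window` class, the `redirect_owner` attribute and the `request_configure` function are made up for illustration and are not part of Xlib or any X binding. The one rule it encodes is the one described above: a request on a window is redirected only to a client that registered for substructure redirection on that window's direct parent.

```python
class Window:
    """A node in a toy window tree (a stand-in for an X window, not real Xlib)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        # Hypothetical field: the client that registered for substructure
        # redirection on this window, if any.
        self.redirect_owner = None

    def reparent(self, new_parent):
        self.parent = new_parent


def request_configure(window):
    """Return who gets to handle a configure request on `window`.

    Only a redirection registered on the *direct* parent applies.
    """
    parent = window.parent
    if parent is not None and parent.redirect_owner is not None:
        return parent.redirect_owner
    return "x-server"


root = Window("root")
root.redirect_owner = "wm"                 # the window manager registers on the root

w = Window("W", parent=root)
before_reparent = request_configure(w)     # intercepted by the window manager

frame = Window("F", parent=root)
w.reparent(frame)
after_reparent = request_configure(w)      # W is now a grandchild of root: not intercepted

frame.redirect_owner = "wm"                # register on the frame as well
after_registering = request_configure(w)   # intercepted again
```

In a real window manager the last step corresponds to selecting substructure redirection on each frame window right after creating it.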

Compositing

Compositing window managers are a relatively new development. Compositing support in X was added in late 2004, a full decade after the last edition of Xlib Programming Manual. The first compositing window managers, Xfwm and Compiz, launched in early 2005.

So, what exactly does a compositing window manager do?

In our discussion above on substructure redirection and reparenting, we saw how a window manager can respond to various requests for a top level window - to display/hide it (map/unmap in X jargon), to resize it, to move it, etc. But we didn’t talk about how to deal with what’s inside the top level windows.

Indeed, from the perspective of the window manager, top level windows are black boxes; they each manage their own descendant windows (UI elements), perhaps through a framework such as GTK+ or Qt, and the window manager has no right to interfere there. The application that creates a top level window is responsible for rendering and handling events for any descendant windows (UI elements), and does so directly through X. This is shown in the first diagram above.

As the computing power of graphics hardware grew, so did people’s expectations from their window managers. With hardware acceleration, it became possible to build much more computationally intensive user interfaces, such as the (in)famous Desktop Cube in Compiz:

or the Shift Switcher:

Let’s take a moment to think about how we can implement an interface such as the Shift Switcher above. When the user triggers this interface, we need to:

  1. Render each top level window and all its descendant windows (UI elements) to an off-screen, in-memory buffer, instead of directly to the hardware.

  2. Transform (rotate, contort, etc.) each buffer according to our design.

  3. Composite the transformed buffers into a final buffer along with a background and any other floating UI elements we need to display.

  4. Create an overlay window that covers the entire screen and hides all other windows.

  5. Render the final buffer into the overlay window.

There are a number of challenges:

  • We must be able to retrieve the displayed contents of top level windows. However, as we described earlier, top level windows render their contents directly through X, without going through the window manager.

  • We need to update our interface in real time as the contents of the top level windows change. However, top level windows do not notify window managers when their contents change, again because they render their contents directly through X.

  • A top level window A may overlap with another top level window B below, which means a portion of B isn’t currently displayed. Our interface, however, must capture the full rendering of A and B, regardless of overlapping regions.

  • This entire compositing process is computationally intensive and requires hardware acceleration to function adequately.

It is clear that none of this would be possible without some heavy cooperation from the X server. Enter the Composite extension:

Many user interface operations would benefit from having pixel contents of window hierarchies available without respect to sibling and antecedent clipping. In addition, placing control over the composition of these pixel contents into a final screen image in an external application will enable a flexible system for dynamic application content presentation.
— X Composite Extension

The Composite extension provides a mechanism to request the X server not to render a specific window and its descendants directly to hardware, but to a special buffer maintained by the X server, and do so without the normal clipping and overlap computations. This buffer can then be read and used by the client that made the request.

That’s exactly what a compositing window manager does: it will ask X to render each top level window to an off-screen, in-memory buffer and composite the results into an overlay window itself. And it needs to do this not just for fancy task switcher interfaces as in our example, but also to achieve effects like translucency, animations, soft shadows, and the like.
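
The idea can be sketched with a toy model in Python. This is not how the Composite extension is actually used; `render_offscreen` and `composite` are hypothetical helpers, and "pixels" are single characters. What it shows is that each window is rendered in full into its own buffer, so the part of a lower window hidden by an upper one is still available, and that the final image is built by painting the buffers bottom to top.

```python
def render_offscreen(width, height, fill):
    """Render a window into its own full, unclipped buffer (a list of rows)."""
    return [[fill] * width for _ in range(height)]


def composite(screen_w, screen_h, windows, background="."):
    """Paint window buffers bottom to top onto a screen-sized final buffer.

    `windows` is a list of (x, y, buffer) tuples ordered from bottom to top.
    """
    final = [[background] * screen_w for _ in range(screen_h)]
    for x, y, buf in windows:
        for row_idx, row in enumerate(buf):
            for col_idx, pixel in enumerate(row):
                sx, sy = x + col_idx, y + row_idx
                # Skip anything that falls outside the screen.
                if 0 <= sx < screen_w and 0 <= sy < screen_h:
                    final[sy][sx] = pixel
    return final


buffer_b = render_offscreen(4, 2, "B")   # lower window
buffer_a = render_offscreen(3, 2, "A")   # upper window, overlapping B
screen = composite(6, 3, [(0, 0, buffer_b), (2, 1, buffer_a)])
rows = ["".join(row) for row in screen]
# rows is ["BBBB..", "BBAAA.", "..AAA."], while buffer_b still holds all of B.
```

A real compositing window manager would transform the buffers (scale, rotate, blend) before painting them, typically on the GPU, but the data flow is the same.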

This is illustrated in the following diagram:

Let’s end this section by considering whether a compositing window manager should reparent top level windows.

Since a compositing window manager already knows the size and position of all top level windows, it’s easy for it to just draw window decorations during compositing into the overlay window using graphics operations (e.g. OpenGL), without ever creating an actual X frame window and reparenting. Some compositing window managers do operate this way.

On the other hand, a window manager may need to support both a compositing and a non-compositing mode, for compatibility with older or unsupported graphics hardware. In this case, it needs to implement reparenting and frame windows for non-compositing mode anyway, so additionally implementing drawing window decorations using graphics operations becomes redundant. This is why many other compositing window managers still choose to reparent.

Ready For Some Code?

If you’ve read everything up to this point, you’re probably holding back the urge to cry out "Enough talk - show me some code!" I don’t blame you.

In the next installment in this series, I will walk you through a basic implementation of a reparenting, non-compositing window manager. Impatient? Check out the code on GitHub!

The Most Popular Fonts On The Web: A Study

If you’ve ever worked on a web site, you already know that choosing the right fonts is one of the most important aesthetic decisions in the design of a site. But, like all aesthetic decisions, it is highly subjective.

I decided to try to bring a little bit of objectivity into the equation by finding out, empirically, what fonts the most popular sites on the web are using today.

Methodology

I wrote a Python program that crawls the front page of the 100,000 most popular web sites according to Alexa’s top sites list. It parses the HTML using BeautifulSoup, and parses all in-line and linked CSS stylesheets using cssutils. It then looks for font and font-family rules in the CSS rules, and stores the normalized form of each font in those rules in order. The result is kept in a SQLite database for analysis.
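
As a rough illustration of the "normalized form of each font ... in order" step, here is a minimal sketch using only the standard library. It is not the crawler's actual code: the function names are made up, the normalization rules (strip quotes, collapse whitespace, lowercase) are my assumption about what a reasonable normalization looks like, and it handles only font-family values, not the font shorthand. It also splits naively on commas, so a quoted font name containing a comma would be mangled.

```python
def normalize_font(name):
    """Strip quotes and surrounding whitespace, collapse spaces, lowercase."""
    name = name.strip().strip("\"'").strip()
    return " ".join(name.split()).lower()


def parse_font_family(value):
    """Split a font-family value into normalized names, in preference order."""
    value = value.replace("!important", "")
    names = (normalize_font(part) for part in value.split(","))
    return [name for name in names if name]


fonts = parse_font_family('"Helvetica  Neue", Arial , sans-serif')
first_choice = fonts[0]
```

Here `fonts` keeps the order in which the rule lists the fonts, so `first_choice` is the font the designer preferred most.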

This all sounds pretty straightforward, but in fact it took me two weeks to build a crawler that doesn’t choke on all the crazy crap people throw at browsers. I will write up some of the more interesting cases in a separate post.

The final crawl success rate was about 96%. The size of all HTML and stylesheets downloaded was about 30GB.

First, let’s take a look at the first-choice/most preferred fonts, i.e., the ones that appear first in font-family rules. These fonts most closely represent the intention of the web site designers without compatibility compromises.

For each font, we calculate the number of distinct web sites that have at least one CSS font or font-family rule that lists the font as the first choice. This means that if a site has two rules, one listing Arial and another listing Times, it will count towards both. Thus, the numbers add up to much higher than the total number of sites.
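
The counting rule can be sketched as follows. The input shape (a dict mapping each site to a list of font lists, one per CSS rule) and the function name are assumptions made for this example; the real data lives in a SQLite database.

```python
from collections import defaultdict


def count_sites_by_first_choice(site_rules):
    """Map each first-choice font to the number of distinct sites using it.

    `site_rules` maps a site name to a list of font lists, one per CSS rule,
    each in preference order.
    """
    sites_by_font = defaultdict(set)
    for site, rules in site_rules.items():
        for fonts in rules:
            if fonts:
                # A set ensures each site counts at most once per font.
                sites_by_font[fonts[0]].add(site)
    return {font: len(sites) for font, sites in sites_by_font.items()}


example = {
    "a.example": [["arial", "sans-serif"], ["times", "serif"], ["arial"]],
    "b.example": [["arial", "helvetica"]],
}
counts = count_sites_by_first_choice(example)
```

In this made-up example `a.example` counts towards both Arial and Times, so the counts sum to 3 even though there are only 2 sites.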

Without further ado, here’s the breakdown of the top fonts on the web:

(The chart is interactive - hover/click to see actual numbers.)

A couple of observations:

  • Sans-serif fonts dominate the web. The list of the top 25 fonts includes only 4 serif fonts, compared to 16 sans-serif fonts. The most popular serif font, Georgia, ranks #4 on the list with about 20% of sites.

  • Monospace fonts get no love. Most sites don’t bother specifying a custom monospace font; the most common monospace font specification just uses the browser default (12%). The most popular monospace font, Monaco, is featured on 7.2% of sites. Both of these are quite high, in fact, considering that we only crawl the front page of these sites.

  • The most popular fonts of each family:

    • Sans-serif: Arial (#1, 62%)

    • Serif: Georgia (#4, 20%)

    • Monospace: Monaco (#11, 7.2%)

  • Old Microsoft fonts still reign over the web. The top non-Microsoft font, Helvetica Neue, ranks #6 with an impressive 18% of top web sites.

  • Font Awesome is awesome. About 4.6% of the top 100K sites already use it for universal icons.

  • The Chinese web is on the rise. 3 out of the top 25 fonts are Chinese fonts, compared to 21 Latin fonts (and 1 symbol font). No other scripts made their way into the list.

What If We Considered…

  • Fonts in all positions: (graph) not much difference.

  • The top 1,000 web sites only: (graph) even more Arial (#1, 74%).

Next, let’s look at first-choice fonts used in headings or titles.

I used a really crude metric to determine whether a CSS rule matches a heading: whether the CSS selector is for an H1-H6 tag or contains the strings "heading" or "title". While this finds only a subset of actual headings, it is not a bad approximation as it matches rules on about 58% of the top 100,000 web sites.
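
A minimal sketch of such a heuristic is shown below. The regular expression and the function name are my own guesses at one way to implement the metric described above, not the code used for the study. In particular, treating `.h1` (a class) as a non-match is a choice made for this example.

```python
import re

# Match h1..h6 used as a tag name: not preceded by a word character,
# '.', '#' or '-', and not followed by a word character or '-'.
HEADING_TAG = re.compile(r"(?<![\w.#-])h[1-6](?![\w-])")


def looks_like_heading(selector):
    """Return True if a CSS selector probably targets a heading or title."""
    s = selector.lower()
    return bool(HEADING_TAG.search(s)) or "heading" in s or "title" in s


results = {
    selector: looks_like_heading(selector)
    for selector in ["h1", "div.content > h3 a", ".page-title", "#mainHeading", "p.body", ".h1"]
}
```

Like the original metric, this is crude: it will miss headings styled through unrelated class names and will match any selector that merely contains "title".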

Let’s start with the graph:

Observations:

  • Header fonts are more diverse. While Microsoft fonts still reign supreme, a number of more exotic fonts are also on the list. In terms of distribution, there’s a much longer tail. It may be a sign that designers pay a lot more attention to fonts used in headings.

  • Serif fonts are more popular in headings than elsewhere. While Arial still claims the top spot with 27.31% of sites, the top serif font, Georgia, rises to 2nd place with 9% of sites.

  • No monospace fonts in headings. Not surprising.

  • Among Chinese fonts, 宋体 is more commonly used for body text, while 雅黑 (or Yahei) is more commonly used for heading text.

Concluding Thoughts

Arial and friends are the most hated fonts ever. Quoting The Scourge of Arial:

Despite its pervasiveness, a professional designer would rarely - at least for the moment - specify Arial.

And yet, Arial is still the default font on the vast majority of sites on the web, followed closely by its friends. Why?

Some people actually have a reason to use them, but most use them mindlessly - just because everyone else does. Often, no thought is given to the design of the site, let alone typography.

This is pretty sad. Paraphrasing Stop Using Lame Fonts, a good font stack has the potential to really make a site design shine, and it’s a shame web designers aren’t exploiting this opportunity.

Caveats

There are a couple of caveats with this dataset.

First, I am only surveying the landing page of each site. For many sites, notably those whose main interface is hidden behind a login (Facebook, Evernote, etc.), we may not be finding the styles that matter most to users. However, I figured since it would be pretty poor design to build a landing page that is aesthetically inconsistent with the rest of the site, it is not very likely that the font selection on the landing page would be too different from the fonts used elsewhere on the site. Of course, without creating fake accounts on these sites, we have no way to verify.

The numbers are not weighted by prominence on the web pages. One could argue that the font of the main body text on a page should carry more weight than that of the tiny disclaimer text at the bottom which no one reads. It would be tricky to determine what the right weight function is though.

The numbers are not weighted by the prominence of the web sites. For example, we could make it so that Google’s use of Arial would get more weight than some random obscure site’s use of Arial, as Google’s choice impacts many more users and was probably the result of deliberation by a team of expert designers. Again, it’s not entirely obvious how the weights should be assigned.

I am only looking at CSS rules in <style> tags or linked stylesheets. That means I am ignoring style="…" attributes in tags, <font> elements, or dynamically assigned fonts (i.e., through JavaScript). I would be surprised if this turns out to be a big loss though.

Next Steps

I can think of quite a few other useful questions this dataset could help answer:

  • What are the most common font pairings?

  • What are the popular choices for heading fonts, given that I’ve chosen font X as my body text font?

  • What are the most common fonts on news sites/forums/productivity web apps/social media sites?

  • Is there any correlation between font choice and site popularity?

What do you think?

Please feel free to download the top 100 first-choice fonts as a CSV file.