Hands-on graphics without X11
A crash course on direct framebuffer and keyboard access via NetBSD’s wscons
Look at these two consoles:
Same font, same colors, same… everything? Other than for the actual text they display, they look identical, don’t they? But the one on the right can do things that the one on the left cannot. Witness this:
A square? OK, meh, we had those in the DOS days with box-drawing characters. But a circle?! That’s only possible because the console on the right is a hybrid console that supports mixing the usual textual grid of a terminal with overlapping graphics.
Now, if you have been following the development of EndBASIC, this is not surprising. The defining characteristic of the EndBASIC console is that it’s hybrid as the video shows. What’s newsworthy, however, is that the EndBASIC console can now run directly on a framebuffer exposed by the kernel. No X11 nor Wayland in the picture (pun intended).
But how? The answer lies in NetBSD’s flexible wscons framework, and this article dives into what it takes to render graphics on a standard Unix system. I’ve found this exercise exciting because, in the old days, graphics were trivial (mode 13h, anyone?) and, for many years now, computers have used framebuffer-backed textual consoles. The kernel is obviously rendering “graphics” by drawing individual letters; so why can’t you, a user of the system, do so too?
wscons overview
wscons(4), or Workstation Console in its full form, is NetBSD’s framework to access the physical console attached to a computer.
wscons abstracts the details of the hardware display and input devices so that the kernel and the user-space configuration tools can treat them all uniformly across the tens of platforms that NetBSD supports. If you use wsconsctl(8) on a modern amd64 laptop to control its display, you use wsconsctl on an ancient vax box to control its display too.
The output architecture of wscons is composed of multiple devices, layered like this:
wsdisplay(4) sits at the top of the stack and implements the console in hardware-independent terms. The functionality at this level includes handling of VT100-like sequences, cursor positioning logic, text wrapping, scrolling decisions, etc.
Under wsdisplay sit the drivers that know how to access specific hardware devices. These include, among others: vga(4), which does not do graphics at all; genfb(4), which is a generic framebuffer driver that talks to the “native” framebuffer of the system (e.g. the one configured by the EFI); and radeonfb(4), which implements an accelerated console on AMD cards. These drivers know how to initialize and interact with the hardware.
Under the graphical drivers sits vcons(4), the driver that implements one or more graphical consoles in terms of a grid of pixels. vcons is parameterized on “raster operations” (rasops), a set of virtual methods to perform low-level operations. An example is the moverows method, which is used by wsdisplay to implement scrolling in the most efficient way provided by the hardware. vcons provides default (inefficient) implementations of these methods, but the upper drivers like radeonfb can provide hardware-accelerated specializations when instantiating vcons. vcons also interacts with wsfont(4) to render text to the console.
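To make the “virtual methods” idea more concrete, here is a minimal sketch in plain C of a table of raster operations with a software fallback. Every name in it (text_grid, raster_ops, soft_moverows, scroll_up) is made up for illustration and is not the kernel’s actual rasops interface. The sketch also moves rows of characters instead of rows of pixels to stay short.

```c
#include <stddef.h>
#include <string.h>

// Hypothetical stand-in for a console: a row-major grid of character cells.
struct text_grid {
    char *cells;  // rows * cols bytes
    int rows;
    int cols;
};

// Hypothetical table of "raster operations". A hardware driver could
// point moverows at an accelerated routine instead of the default.
struct raster_ops {
    void (*moverows)(struct text_grid *g, int src, int dst, int nrows);
};

// Default software implementation: copy whole rows with memmove, which
// is safe even when the source and destination ranges overlap.
static void soft_moverows(struct text_grid *g, int src, int dst, int nrows) {
    memmove(g->cells + (size_t)dst * g->cols,
            g->cells + (size_t)src * g->cols,
            (size_t)nrows * g->cols);
}

// Scroll up by one line through whichever moverows was supplied, then
// blank the last row.
static void scroll_up(struct text_grid *g, const struct raster_ops *ops) {
    ops->moverows(g, 1, 0, g->rows - 1);
    memset(g->cells + (size_t)(g->rows - 1) * g->cols, ' ', (size_t)g->cols);
}
```

The point of the indirection is that scroll_up never needs to know whether the rows are moved by the CPU or by a blitter on the graphics card.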
The input architecture of wscons is similar in terms of layering of devices, albeit somewhat simpler:
wsmux(4) is an optional component that multiplexes multiple input devices under a single virtual device for event extraction.
wskbd(4) sits at the top of the stack (not accounting for wsmux) and implements generic keyboard handling. The functionality at this level includes translating keycodes to layouts, handling key input repetition, and more. wskbd exposes a stream of wsevents to user-space so that user-space can process state changes (e.g. key presses).
Under wskbd sit the device drivers that know how to deal with specific hardware devices. These include, among others: ukbd(4) for USB keyboard input and pckbd(4) for PC/AT keyboard input. These drivers wait for hardware input, generate events, and provide a map of keycodes to key symbols to the upper layer so that wskbd can operate in generic terms.
The input architecture can handle other types of devices like mice and touch panels (both via wsmouse(4)), but I’m not going to cover those here. Just know that they sit under wsmux at the equivalent level of wskbd and produce a set of wsevents in the exact same manner as wskbd.
Querying framebuffer properties
As you can sense from the overview, the whole architecture under wsdisplay is geared towards graphical video devices, with the text-only vga driver as the exception. In the common case, wsdisplay is backed by a graphical framebuffer managed by vcons for text rendering, yet the user only sees a textual console. But if the kernel has direct access to the framebuffer, user-space should too.
The details on how to do this click if you read through the operations described in the wsdisplay manual page. In particular, you may notice the WSDISPLAYIO_GET_FBINFO call which retrieves extended information about, you guessed it, a framebuffer display.
Let’s try it: I wrote a trivial program to open the display device (named /dev/ttyE0 for reasons that escape me), call this function, and store the results in an fbinfo structure:
// wsdisplay-fbinfo.c
// https://jmmv.dev/src/netbsd-graphics-wo-x11/wsdisplay-fbinfo.c
#include <sys/param.h>
#include <sys/types.h>
#include <sys/ioctl.h>
#include <dev/wscons/wsconsio.h>
#include <err.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>
int main(void) {
    // Open the main wsdisplay device.
    int fd = open("/dev/ttyE0", O_RDWR | O_NONBLOCK | O_EXCL);
    if (fd == -1)
        err(1, "open failed");
    // Query information about the framebuffer.
    struct wsdisplayio_fbinfo fbinfo;
    if (ioctl(fd, WSDISPLAYIO_GET_FBINFO, &fbinfo) == -1)
        err(1, "ioctl failed");
    close(fd);
    exit(EXIT_SUCCESS);
}
Hmm, but this program does not have any visible output, right? The code just queries the framebuffer information and does nothing with it. The reason is that the content of the wsdisplayio_fbinfo structure is large and I didn’t want to pretty-print it myself. I thought it’d be fun to show you how to use GDB to inspect large data structures and how to script the process. Here, look:
gdb -q \
    -ex 'set print pretty on' \
    -ex 'break exit' \
    -ex 'run' \
    -ex 'frame 1' \
    -ex 'print fbinfo' \
    -ex 'cont' \
    -ex 'quit' \
    ./wsdisplay-fbinfo
This call to GDB starts the sample program shown above and automates various GDB commands to set a breakpoint on exit, run the program until it hits that breakpoint, and pretty-print the fbinfo structure right before exiting. When we execute this command as root (which is important to get access to the /dev/ttyE0 device), we get this:
Neat. We get sensible stuff from the kernel! fbi_width is 640 and fbi_height is 480, which matches the 640x480 resolution I have configured in my test VM.
Drawing to the framebuffer
But note these other fields in the structure printed above:
struct wsdisplayio_fbinfo {
    uint64_t fbi_fbsize;
    uint64_t fbi_fboffset;
    // ... more fields ...
};
The fbi_fbsize and fbi_fboffset fields are begging us to use mmap to memory-map the area of the device starting at fbi_fboffset and spanning fbi_fbsize bytes. Presumably we can write to the framebuffer if we do this, but beforehand, we have to switch the console to “framebuffer mode” by using the WSDISPLAYIO_SMODE (“set mode”) call. This call accepts an integer to indicate which mode to set:
WSDISPLAYIO_MODE_EMUL: Set the display to emulating (text) mode. This is the default operation mode of wsdisplay and configures the console to “emulate” a text terminal.
WSDISPLAYIO_MODE_MAPPED: Set the display to mapped (graphics) mode. This allows access to the framebuffer and allows the mmap operation to succeed.
WSDISPLAYIO_MODE_DUMBFB: Set the display to mapped (framebuffer) mode. This is similar to WSDISPLAYIO_MODE_MAPPED and, for our purposes in the demo below, works the same. I haven’t found a concise description of how these two differ, but from my reading of the code, the “mapped” mode offers access to the framebuffer as well as device-specific control registers, whereas “dumb framebuffer” just exposes the framebuffer memory.
In any case. Once we know that we have to switch the console device to a graphical mode before mapping the framebuffer, and having access to the pixel format described in the fbinfo structure… drawing something fun is just a few byte manipulation operations away:
// wsdisplay-draw.c
// https://jmmv.dev/src/netbsd-graphics-wo-x11/wsdisplay-draw.c
#include <sys/param.h>
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <dev/wscons/wsconsio.h>
#include <err.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>
int main(void) {
    // Open the main wsdisplay device.
    int fd = open("/dev/ttyE0", O_RDWR | O_NONBLOCK | O_EXCL);
    if (fd == -1)
        err(1, "open failed");
    // Query information about the framebuffer.
    struct wsdisplayio_fbinfo fbinfo;
    if (ioctl(fd, WSDISPLAYIO_GET_FBINFO, &fbinfo) == -1)
        err(1, "ioctl failed");
    // Ensure the framebuffer aligns with the expectations of our demo
    // code below.
    if (fbinfo.fbi_bitsperpixel != 32)
        errx(1, "bitsperpixel not supported by this demo");
    if (fbinfo.fbi_pixeltype != WSFB_RGB)
        errx(1, "pixeltype not supported by this demo");
    // Configure the wsdisplay to enter "dumb framebuffer" mode.
    unsigned int mode = WSDISPLAYIO_MODE_DUMBFB;
    if (ioctl(fd, WSDISPLAYIO_SMODE, &mode) == -1)
        err(1, "ioctl failed");
    // Map the framebuffer memory. Must come after the SMODE ioctl.
    uint32_t *ptr = (uint32_t*)mmap(
        0, fbinfo.fbi_fbsize, PROT_READ | PROT_WRITE, MAP_SHARED,
        fd, fbinfo.fbi_fboffset);
    if (ptr == MAP_FAILED)
        err(1, "mmap failed");
    // Fill the screen multiple times with pixels of different
    // colors to render a simple animation.
    size_t pixels = fbinfo.fbi_fbsize / sizeof(uint32_t);
    int off = 0;
    for (int frame = 0; frame < 100; frame++) {
        int r = off; int g = off; int b = off;
        for (size_t i = 0; i < pixels; i++) {
            r = (r + 1) % 255; g = (g + 2) % 255; b = (b + 3) % 255;
            ptr[i] = 0
                | (r << fbinfo.fbi_subtype.fbi_rgbmasks.red_offset)
                | (g << fbinfo.fbi_subtype.fbi_rgbmasks.green_offset)
                | (b << fbinfo.fbi_subtype.fbi_rgbmasks.blue_offset);
        }
        off += 10;
        usleep(1);
    }
    // Configure the wsdisplay to enter "console emulation" mode.
    // In other words: return to the console.
    mode = WSDISPLAYIO_MODE_EMUL;
    if (ioctl(fd, WSDISPLAYIO_SMODE, &mode) == -1) {
        err(1, "ioctl failed");
    }
    close(fd);
    return EXIT_SUCCESS;
}
And if we run this:
Voila. We’ve got graphics without paying the X11 startup tax. Switching from the console to graphics is instantaneous, like in the good old mode 13h days.
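If you want more than full-screen fills (say, the circle from the intro), you need to address individual pixels. Here is a minimal sketch that works on any 32-bit RGB buffer. It assumes that consecutive rows sit a fixed number of bytes apart, which as far as I can tell is what the fbi_stride field of the fbinfo structure reports, so double-check against your headers. The fb_view struct and the pack_rgb, put_pixel, and draw_circle helpers are names made up for this example; they are not part of wscons.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical subset of the values reported by WSDISPLAYIO_GET_FBINFO,
// copied into a plain struct so that the helpers below stay portable.
struct fb_view {
    uint8_t *base;    // start of the mapped framebuffer
    uint32_t width;   // visible width in pixels
    uint32_t height;  // visible height in pixels
    uint32_t stride;  // bytes between the starts of consecutive rows
    uint32_t red_offset, green_offset, blue_offset;  // bit positions
};

// Pack three 8-bit color components into one 32-bit pixel.
static uint32_t pack_rgb(const struct fb_view *fb,
                         uint8_t r, uint8_t g, uint8_t b) {
    return ((uint32_t)r << fb->red_offset)
        | ((uint32_t)g << fb->green_offset)
        | ((uint32_t)b << fb->blue_offset);
}

// Plot one pixel, silently ignoring out-of-bounds coordinates.
static void put_pixel(const struct fb_view *fb, int x, int y,
                      uint32_t color) {
    if (x < 0 || y < 0)
        return;
    if ((uint32_t)x >= fb->width || (uint32_t)y >= fb->height)
        return;
    uint32_t *row = (uint32_t *)(fb->base + (size_t)y * fb->stride);
    row[x] = color;
}

// Outline a circle with the midpoint algorithm, which computes one
// octant and mirrors it into the other seven.
static void draw_circle(const struct fb_view *fb, int cx, int cy,
                        int radius, uint32_t color) {
    int x = radius, y = 0, err = 1 - radius;
    while (x >= y) {
        put_pixel(fb, cx + x, cy + y, color);
        put_pixel(fb, cx - x, cy + y, color);
        put_pixel(fb, cx + x, cy - y, color);
        put_pixel(fb, cx - x, cy - y, color);
        put_pixel(fb, cx + y, cy + x, color);
        put_pixel(fb, cx - y, cy + x, color);
        put_pixel(fb, cx + y, cy - x, color);
        put_pixel(fb, cx - y, cy - x, color);
        y++;
        if (err < 0) {
            err += 2 * y + 1;
        } else {
            x--;
            err += 2 * (y - x) + 1;
        }
    }
}
```

In the demo above, you would fill an fb_view from the mmap result and the fbinfo fields, and then call something like draw_circle(&fb, 320, 240, 100, pack_rgb(&fb, 255, 255, 255)) between the two SMODE calls.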
Handling keyboard input
Rendering graphics is just half of the puzzle when writing an interactive application though. The other half is handling input. And, for that, we have to turn to the wskbd device.
After we switch the console to mapped mode, keystrokes don’t go to stdin anymore. We have to write code to explicitly read from an attached keyboard, and we can do this via the /dev/wskbd0 device representing the first attached keyboard.
Once we open the keyboard device for reading, wscons sends us its own representation of events known as wsevents. We can write a trivial program to read one key press:
// wskbd-trivial.c
// https://jmmv.dev/src/netbsd-graphics-wo-x11/wskbd-trivial.c
#include <sys/param.h>
#include <sys/types.h>
#include <sys/ioctl.h>
#include <dev/wscons/wsconsio.h>
#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main(int argc, char** argv) {
    // Open the main wskbd device.
    int fd = open("/dev/wskbd0", O_RDONLY);
    if (fd == -1)
        err(1, "open failed");
    // Wait for one key down press only.
    for (;;) {
        struct wscons_event ev;
        ssize_t ret = read(fd, &ev, sizeof(ev));
        if (ret == -1)
            err(1, "read failed");
        // Do not inspect a partially-filled event.
        if (ret != (ssize_t)sizeof(ev))
            errx(1, "short read");
        if (ev.type == WSCONS_EVENT_KEY_DOWN) {
            printf("value: %d, char '%c'\n", ev.value, (char)ev.value);
            break;
        }
    }
    close(fd);
    return EXIT_SUCCESS;
}
But… if we try to run it and press a key, say k, we might get:
# ./wskbd-trivial
value: 37, char '%'
# █
Huh. We pressed k but the character we got is %. Not what we expected! Well, as it turns out, the “value” that wsevents report for key presses (37 in this case) is the raw keycode of the key. This is hardware-specific and needs to be translated to an actual symbol via a keymap.
One feature of wskbd is that it exposes the keymap as configured in the kernel so there is a single source of truth for the machine. We can query a portion of it with another program:
// wskbd-map.c
// https://jmmv.dev/src/netbsd-graphics-wo-x11/wskbd-map.c
#include <sys/param.h>
#include <sys/types.h>
#include <sys/ioctl.h>
#include <dev/wscons/wsconsio.h>
#include <dev/wscons/wsksymdef.h>
#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
int main(int argc, char** argv) {
    // Open the main wskbd device.
    int fd = open("/dev/wskbd0", O_RDONLY);
    if (fd == -1)
        err(1, "open failed");
    // Allocate space for the biggest possible keymap.
    struct wscons_keymap map[WSKBDIO_MAXMAPLEN];
    memset(map, 0, sizeof(struct wscons_keymap) * WSKBDIO_MAXMAPLEN);
    // Get the keymap from the device.
    struct wskbd_map_data data;
    data.maplen = WSKBDIO_MAXMAPLEN;
    data.map = map;
    if (ioctl(fd, WSKBDIO_GETMAP, &data) == -1)
        err(1, "ioctl failed");
    // Dump keymap entries.
    printf("Keymap length: %u entries\n", data.maplen);
    for (size_t i = 0; i < data.maplen; i++) {
        // Skip printing entries that are not for letters.
        if (map[i].command != KS_voidSymbol)
            continue;
        char normal = map[i].group1[0];
        char shifted = map[i].group1[1];
        if (normal < 'a' || normal > 'z')
            continue;
        // size_t is unsigned, so print it with %zu.
        printf("Keycode %zu: '%c', '%c'\n", i, normal, shifted);
    }
    close(fd);
    return EXIT_SUCCESS;
}
And if we run it, we might get:
# ./wskbd-map
Keymap length: 222 entries
Keycode 16: 'q', 'Q'
Keycode 17: 'w', 'W'
Keycode 18: 'e', 'E'
Keycode 19: 'r', 'R'
Keycode 20: 't', 'T'
Keycode 21: 'y', 'Y'
Keycode 22: 'u', 'U'
Keycode 23: 'i', 'I'
Keycode 24: 'o', 'O'
Keycode 25: 'p', 'P'
Keycode 30: 'a', 'A'
Keycode 31: 's', 'S'
Keycode 32: 'd', 'D'
Keycode 33: 'f', 'F'
Keycode 34: 'g', 'G'
Keycode 35: 'h', 'H'
Keycode 36: 'j', 'J'
Keycode 37: 'k', 'K'
Keycode 38: 'l', 'L'
Keycode 44: 'z', 'Z'
Keycode 45: 'x', 'X'
Keycode 46: 'c', 'C'
Keycode 47: 'v', 'V'
Keycode 48: 'b', 'B'
Keycode 49: 'n', 'N'
Keycode 50: 'm', 'M'
# █
This dump is telling us how keycodes map to symbols, both in “normal” and in shifted form. If we look up keycode 37, we indeed find the letter k. With this, it’s just an SMOP (a “simple matter of programming”) to come up with a program that parses the keymap as exposed by wskbd and converts keycodes to something useful.
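As a starting point, here is a minimal sketch of the lookup step. It assumes you have already copied the two symbols that the program above reads from group1[0] and group1[1] into a simplified table; the key_entry struct and the keycode_to_char function are hypothetical names for this example, and tracking whether Shift is currently held is left to the caller.

```c
#include <stdbool.h>
#include <stddef.h>

// Hypothetical, simplified copy of one keymap entry: the unshifted and
// shifted symbols for a single keycode.
struct key_entry {
    unsigned int normal;
    unsigned int shifted;
};

// Translate a raw keycode into a printable ASCII character. Returns 0
// when the keycode is out of range or does not map to printable ASCII.
static char keycode_to_char(const struct key_entry *map, size_t maplen,
                            int keycode, bool shift) {
    if (keycode < 0 || (size_t)keycode >= maplen)
        return 0;
    unsigned int sym = shift ? map[keycode].shifted : map[keycode].normal;
    if (sym < 0x20 || sym > 0x7e)
        return 0;
    return (char)sym;
}
```

With the keymap dumped above, keycode_to_char(map, maplen, 37, false) would yield k, which is what we wanted to see in the first place.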
This is all good and dandy, but what happens if the keyboard is not connected when you try to open /dev/wskbd0? (Spoiler: the open call fails.) Or what happens if your computer has more than one keyboard attached? (Spoiler: you can only read events from one.) This is where wsmux comes to the rescue: a device driver that multiplexes multiple input devices into one.
By default, the system reserves /dev/wsmux0 as the multiplexer for all attached mice and /dev/wsmux1 as the multiplexer for all attached keyboards. We can define our own too via the wsmuxctl(8) command line utility.
wsmux also supports “hot plugging”: you can open a /dev/wsmuxN device even when there is no physical hardware attached, and whenever a peripheral is connected, it automatically becomes part of the mux. So, if we modify the program above to open /dev/wsmux1 instead of /dev/wskbd0, the program will be resilient to missing keyboards and it’ll recognize multiple keyboards. Easy peasy!
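If you are not sure that the mux exists on a given machine (for example, with a custom kernel configuration), one option is to try the mux first and fall back to the first keyboard. This is a minimal sketch of that idea using only open; the open_first helper is a made-up name, and the fallback order is my assumption of what is sensible, not something wscons mandates.

```c
#include <fcntl.h>
#include <stddef.h>

// Try each path in order and return the first descriptor that opens.
// Returns -1 if none of them can be opened; errno then reflects the
// last attempt.
static int open_first(const char *const *paths, size_t npaths, int flags) {
    for (size_t i = 0; i < npaths; i++) {
        int fd = open(paths[i], flags);
        if (fd != -1)
            return fd;
    }
    return -1;
}
```

In the keyboard programs above, the open call would then become open_first with the list { "/dev/wsmux1", "/dev/wskbd0" } and the O_RDONLY flag.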
What will you build?
You are now equipped with the basics to write graphical applications on a NetBSD system (and maybe OpenBSD too) without running X11. I know NetBSD may not be your jam, but it is a good choice for embedded projects due to its console architecture and other features like its build system.
If the code above still seems mysterious, you can read the source code for the xf86-video-wsfb and xf86-input-ws drivers for X.org. The code is easy enough to read, although it is longer because it has to support all the bells and whistles of wsdisplay and wskbd. (I took shortcuts above by making various assumptions on pixel formats and the like.)
And, guess what, I am indeed working on an embedded project! A little dev box that can boot straight into EndBASIC with super-fast boot times and for which I couldn’t afford the X11 startup penalty.
Stay tuned. In the meantime, what will YOU build? For those of us in the U.S., there is a 3-day weekend ahead and this can be a good distraction. Have fun!