Austin Shafer


Notes on FreeBSD's WiFi Source Code

Last August I bought a Netgear A6100 USB WiFi adapter for my desktop because somehow my house doesn't have ethernet wired through the walls. Luckily it works on FreeBSD, but there are two problems:

  1. The speed is absolutely terrible. Running Windows on the same computer I get 30+ MBps on a bad day, but somehow on FreeBSD the best I can get is 1.1 MBps. Half the time it slows down to 100 KBps...
  2. The AT&T router we have does not get along with wpa_supplicant, meaning I can't authenticate and connect to the wifi. Instead I have to run a second (and much worse) router that wpa_supplicant does get along with.

Today I'm procrastinating on my school projects and reading FreeBSD's 802.11 code. There's a lot there and it's hard to follow, so these notes are really for my future self.

Also these notes aren't necessarily beginner friendly, and they are certainly not comprehensive. Below I outline some of the functions and structures that I read through while getting my bearings in rtwn.ko. I'll then include some profiling of my adapter downloading a large file over the wifi.

I highly recommend plugging the highlighted identifiers into cscope or fxr.watson.org to take a look for yourself.

Notable code features

A lot of the 802.11 kernel functions are actually in the manpages, so I'll try not to repeat them here. A good point to start reading is the ieee80211com struct, as it seems to be the "main" 802.11 structure. Other notable structs include ieee80211_channel, which represents a channel to send/receive on, and ieee80211vap, which represents a virtual access point.

There are two options for getting more speed out of my adapter: improve the current channel's performance or try my luck at implementing 802.11ac. FreeBSD actually has support for 802.11ac in the net80211 protocol library, but drivers need to implement 802.11ac as well.

One of the tricky things is finding out how 802.11ac is even represented in the code. A good starting place is the physical mode. There isn't a single named ac mode like I expected; instead there are two VHT (very high throughput) modes that represent acg (2GHz) and ac (5GHz).

/*
 * PHY mode; this is not really a mode as multi-mode devices
 * have multiple PHY's.  Mode is mostly used as a shorthand
 * for constraining which channels to consider in setting up
 * operation.  Modes used to be used more extensively when
 * channels were identified as IEEE channel numbers.
 */
enum ieee80211_phymode {
        ...
        IEEE80211_MODE_VHT_2GHZ = 12,   /* 2GHz, VHT */
        IEEE80211_MODE_VHT_5GHZ = 13,   /* 5GHz, VHT */
};

sbin/ifconfig/ifieee80211.c's modename maps the enumerator above to the string names. In addition to the modes there are also VHT channel identifiers, which are aggregated in IEEE80211_CHAN_VHT.
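To make the enum-to-name mapping concrete, here is a minimal userland sketch of that kind of lookup table. It is not the ifconfig source: the `sketch_` names are made up for illustration, only the two VHT values come from the excerpt above, and the "11acg"/"11ac" labels are what I believe ifconfig prints, so treat them as an assumption.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Trimmed copy of the mode enum. Only the two VHT values are taken from
 * the excerpt above; every name here is local to this sketch.
 */
enum sketch_phymode {
	SKETCH_MODE_VHT_2GHZ = 12,	/* 2GHz, VHT */
	SKETCH_MODE_VHT_5GHZ = 13,	/* 5GHz, VHT */
	SKETCH_MODE_MAX
};

/* Designated initializers leave every slot that is not listed as NULL. */
static const char *sketch_modename[SKETCH_MODE_MAX] = {
	[SKETCH_MODE_VHT_2GHZ] = "11acg",	/* assumed label */
	[SKETCH_MODE_VHT_5GHZ] = "11ac",	/* assumed label */
};

/* Return the label for a mode, or "unknown" for a gap in the table. */
static const char *
sketch_mode_to_name(int mode)
{
	assert(mode >= 0);
	if (mode >= SKETCH_MODE_MAX || sketch_modename[mode] == NULL)
		return ("unknown");
	return (sketch_modename[mode]);
}
```

Calling `sketch_mode_to_name(SKETCH_MODE_VHT_5GHZ)` hands back the 5GHz label, while any mode missing from the table falls through to "unknown" instead of dereferencing a NULL pointer.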

static void
rtwn_getradiocaps(struct ieee80211com *ic,
    int maxchans, int *nchans, struct ieee80211_channel chans[])
{
        struct rtwn_softc *sc = ic->ic_softc;
        uint8_t bands[IEEE80211_MODE_BYTES];
        int i;

        memset(bands, 0, sizeof(bands));
        setbit(bands, IEEE80211_MODE_11B);
        setbit(bands, IEEE80211_MODE_11G);
        setbit(bands, IEEE80211_MODE_11NG);
        ieee80211_add_channels_default_2ghz(chans, maxchans, nchans,
            bands, !!(ic->ic_htcaps & IEEE80211_HTCAP_CHWIDTH40));

        /* XXX workaround add_channel_list() limitations */
        setbit(bands, IEEE80211_MODE_11A);
        setbit(bands, IEEE80211_MODE_11NA);
        for (i = 0; i < nitems(sc->chan_num_5ghz); i++) {
                if (sc->chan_num_5ghz[i] == 0)
                        continue;

                ieee80211_add_channel_list_5ghz(chans, maxchans, nchans,
                    sc->chan_list_5ghz[i], sc->chan_num_5ghz[i], bands,
                    !!(ic->ic_htcaps & IEEE80211_HTCAP_CHWIDTH40));
        }
}

rtwn sets the modes that it supports in rtwn_getradiocaps, calling ieee80211_add_channels_default_2ghz and ieee80211_add_channel_list_5ghz to add the channels. This is called by rtwn_attach during initialization. As you can see, the highest mode supported at the moment is 802.11n.
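The bands array in rtwn_getradiocaps is just a bitmap indexed by PHY mode. Below is a small userland sketch of that mechanism. Everything prefixed `BM_` or `bm_` is hypothetical, the macros are local stand-ins for the kernel's setbit/isset, and the 11B/11G/11NG values are my recollection of the enum rather than something shown in these notes. It only demonstrates the bitmap bookkeeping; it says nothing about what net80211 or the hardware would need for real VHT support.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

/* Local stand-ins for the kernel's setbit()/isset() macros. */
#define BM_SETBIT(a, i)	((a)[(i) / CHAR_BIT] |= (uint8_t)(1u << ((i) % CHAR_BIT)))
#define BM_ISSET(a, i)	((((a)[(i) / CHAR_BIT]) >> ((i) % CHAR_BIT)) & 1u)

/* Mode numbers; only the VHT value appears in the excerpt earlier on. */
enum bm_phymode {
	BM_MODE_11B = 2,		/* assumed value */
	BM_MODE_11G = 3,		/* assumed value */
	BM_MODE_11NG = 11,		/* assumed value */
	BM_MODE_VHT_2GHZ = 12,
	BM_MODE_MAX = 14
};

/* Enough bytes to hold one bit per mode. */
#define BM_MODE_BYTES	((BM_MODE_MAX + CHAR_BIT - 1) / CHAR_BIT)

/*
 * Fill the bitmap the way the 2GHz half of rtwn_getradiocaps does,
 * optionally adding the VHT bit as well.
 */
static void
bm_fill_2ghz_bands(uint8_t bands[BM_MODE_BYTES], int want_vht)
{
	assert(bands != NULL);
	memset(bands, 0, BM_MODE_BYTES);
	BM_SETBIT(bands, BM_MODE_11B);
	BM_SETBIT(bands, BM_MODE_11G);
	BM_SETBIT(bands, BM_MODE_11NG);
	if (want_vht)
		BM_SETBIT(bands, BM_MODE_VHT_2GHZ);
}
```

After `bm_fill_2ghz_bands(bands, 0)` the 11NG bit is set and the VHT bit is clear, which matches what the driver advertises today.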

Many drivers like rtwn have three parts: the main logical driver, a driver that connects to the pci bus, and a driver that connects to the usb bus. sys/dev/rtwn/if_rtwn.c is the main driver. rtwn also needs to register itself with the IEEE80211 stack, which is done with ieee80211_ifattach.

sys/net80211/ieee80211_vht.c seems to be the work in progress location of the VHT (and therefore 802.11ac) modes. ieee80211_vht_adjust_channel is in charge of promoting the interface up to the full VHT speeds.

If 802.11ac were enabled in rtwn, r12a_tx_raid would also have to be updated with the VHT modes. There's even a TODO comment requesting that.

static void
r12a_tx_raid(struct rtwn_softc *sc, struct r12a_tx_desc *txd,
    struct ieee80211_node *ni, int ismcast)
{
        ...
        switch (mode) {
        case IEEE80211_MODE_11A:
                raid = R12A_RAID_11G;
                break;
        case IEEE80211_MODE_11B:
                raid = R12A_RAID_11B;
                break;
        ...
        default:
                /* TODO: 80 MHz / 11ac */
                device_printf(sc->sc_dev, "unknown mode(2) %d!\n", mode);
                return;
        }

        txd->txdw1 |= htole32(SM(R12A_TXDW1_RAID, raid));
}

The function above is in charge of setting the rate adaptation modes, which is what "raid" refers to.
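For my own reference, this is roughly the shape I would expect the TODO to take: one more case in the switch. It is a standalone sketch with made-up names. In particular `SKETCH_RAID_11AC` is hypothetical; I have not looked up what rate adaptation ID the hardware actually wants for VHT.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-ins for the PHY modes; names are local to this sketch. */
enum sketch_mode {
	SKETCH_MODE_11A,
	SKETCH_MODE_11B,
	SKETCH_MODE_VHT_5GHZ,
	SKETCH_MODE_UNHANDLED
};

/* Cut-down stand-ins for the rate adaptation IDs. */
enum sketch_raid {
	SKETCH_RAID_11B,
	SKETCH_RAID_11G,
	SKETCH_RAID_11AC	/* hypothetical, not a real driver constant */
};

/*
 * Pick a rate adaptation ID for a mode. Returns 0 and fills *raid on
 * success, or -1 for a mode the switch does not know about.
 */
static int
sketch_pick_raid(enum sketch_mode mode, enum sketch_raid *raid)
{
	assert(raid != NULL);
	switch (mode) {
	case SKETCH_MODE_11A:
		*raid = SKETCH_RAID_11G;	/* same pairing as the excerpt */
		break;
	case SKETCH_MODE_11B:
		*raid = SKETCH_RAID_11B;
		break;
	case SKETCH_MODE_VHT_5GHZ:
		*raid = SKETCH_RAID_11AC;	/* the case the TODO asks for */
		break;
	default:
		return (-1);
	}
	return (0);
}
```

Returning an error code instead of printing keeps the sketch testable in userland; the real function logs with device_printf and returns early.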

sys/dev/rtwn/rtl8821a/ holds the device-specific code used for my Netgear A6100. The chip directories seem to share a lot of code; I think they sort of build on each other. The USB driver for rtwn has a list of supported devices in rtwn_devs. Below that is rtwn_chip_usb_attach, which specifies an array of attach functions that line up with the different "generations" of rtwn hardware. Mine calls r21au_attach to set up the function pointers that the USB code should call. r12a_fill_tx_desc will then call r12a_tx_raid to set up the channel mode.
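The attach-table pattern is easier to see in miniature. The sketch below is not the rtwn code: the chip IDs, the softc layout, and every function name are invented for illustration. It only shows the idea of an array of attach functions, indexed by hardware generation, each of which installs its own function pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hardware generations. */
enum sketch_chip {
	SKETCH_CHIP_GEN_A,
	SKETCH_CHIP_GEN_B,
	SKETCH_CHIP_MAX
};

/* Hypothetical softc: one hook plus a field recording who ran last. */
struct sketch_softc {
	int last_tx_gen;
	void (*fill_tx_desc)(struct sketch_softc *sc);
};

/* Per-generation transmit routines. */
static void
gen_a_fill_tx_desc(struct sketch_softc *sc)
{
	sc->last_tx_gen = SKETCH_CHIP_GEN_A;
}

static void
gen_b_fill_tx_desc(struct sketch_softc *sc)
{
	sc->last_tx_gen = SKETCH_CHIP_GEN_B;
}

/* Per-generation attach routines: each installs its own hooks. */
static void
gen_a_attach(struct sketch_softc *sc)
{
	sc->fill_tx_desc = gen_a_fill_tx_desc;
}

static void
gen_b_attach(struct sketch_softc *sc)
{
	sc->fill_tx_desc = gen_b_fill_tx_desc;
}

typedef void (*sketch_attach_fn)(struct sketch_softc *sc);

/* The table: one attach function per generation. */
static const sketch_attach_fn sketch_attach_table[SKETCH_CHIP_MAX] = {
	[SKETCH_CHIP_GEN_A] = gen_a_attach,
	[SKETCH_CHIP_GEN_B] = gen_b_attach,
};

/* Look up the generation and let it wire up the softc. */
static void
sketch_chip_attach(struct sketch_softc *sc, enum sketch_chip chip)
{
	assert((int)chip >= 0 && (int)chip < SKETCH_CHIP_MAX);
	sketch_attach_table[chip](sc);
}
```

After `sketch_chip_attach(&sc, SKETCH_CHIP_GEN_B)`, the bus code can call `sc.fill_tx_desc(&sc)` without knowing which generation it is talking to.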

Profiling

I used dtrace to profile the kernel and then plugged the output into Brendan Gregg's flamegraph tool. That gave me a pretty interactive svg to mess with.

# Record kernel and user stacks at 99 Hz,
# then exit after 30 seconds
$ dtrace -x stackframes=100 -x ustackframes=100 \
	 -n 'profile-99 { @[stack(), ustack(), execname] = count(); }
	 tick-30s { exit(0); }' \
	 -o /tmp/d.out

# Generate the flamegraph
$ ./FlameGraph/stackcollapse.pl /tmp/d.out > /tmp/d.folded
$ ./FlameGraph/flamegraph.pl /tmp/d.folded > flamegraph.svg

[Figure: rtwn flamegraph. The original post links to an interactive SVG version.]

There are a few bars of this graph with empty space above them, signaling that they are execution time hotspots:

  • rtwn_bulk_rx_callback
  • rtwn_rx_common
  • the USB stack in general, which dominates the execution profile

What's immediately obvious is that performing USB operations far outweighs any 802.11 code. I guess I spent my time taking notes on the wrong protocol stack. ieee80211_input_mimo winds up starting a USB transaction, which gets handled by the same rtwn_bulk_rx_callback.

By digging through the code we can see that rtwn_rx_common calls rtwn_extend_rx_tsf, which in turn causes a USB transfer to take place. Note that rtwn_read_4 is the wrapper that actually calls the USB read method, but it is either too cheap or gets inlined, so it doesn't show up in the trace.

The real problem is that reading any register like this is going to trigger a USB transfer. Obviously the overhead of an entire transfer grossly outweighs the cost of reading one register. Once you start asking for multiple registers the USB overhead really adds up and... well... you see the graph.
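Here is a toy model of why per-register reads hurt. It is a userland sketch with a fake device: `sketch_dev`, the register layout, and all of the functions are hypothetical, and each call to `sketch_read_region` stands in for one complete USB transfer. Whether the real hardware or the rtwn USB code would allow a combined read like this is not something I have checked.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Fake device: a small register file plus a transfer counter. */
struct sketch_dev {
	uint8_t regs[16];
	unsigned transfers;
};

/* Every call models one full USB transfer, whatever the length. */
static void
sketch_read_region(struct sketch_dev *d, size_t addr, void *buf, size_t len)
{
	assert(addr + len <= sizeof(d->regs));
	memcpy(buf, &d->regs[addr], len);
	d->transfers++;
}

/* Read one little-endian 32-bit register: one transfer. */
static uint32_t
sketch_read_4(struct sketch_dev *d, size_t addr)
{
	uint8_t b[4];

	sketch_read_region(d, addr, b, sizeof(b));
	return ((uint32_t)b[0] | (uint32_t)b[1] << 8 |
	    (uint32_t)b[2] << 16 | (uint32_t)b[3] << 24);
}

/* A 64-bit value fetched as two registers: two transfers. */
static uint64_t
sketch_tsf_two_reads(struct sketch_dev *d)
{
	uint64_t lo = sketch_read_4(d, 0);
	uint64_t hi = sketch_read_4(d, 4);

	return ((hi << 32) | lo);
}

/* The same value fetched as one 8-byte region: one transfer. */
static uint64_t
sketch_tsf_one_read(struct sketch_dev *d)
{
	uint8_t b[8];
	uint64_t v = 0;

	sketch_read_region(d, 0, b, sizeof(b));
	for (int i = 7; i >= 0; i--)
		v = (v << 8) | b[i];
	return (v);
}
```

Both helpers return the same value, but the first bumps the transfer counter twice and the second only once. With a fixed cost per transfer, that count is what matters far more than the number of bytes moved.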

Side note: you can see the types of USB transfers specific to rtwn at sys/dev/rtwn/usb/rtwn_usb_ep.c. Here is the bulk receive one:

static const struct usb_config rtwn_config_common[RTWN_N_TRANSFER] = {
        [RTWN_BULK_RX] = {
                .type = UE_BULK,
                .endpoint = UE_ADDR_ANY,
                .direction = UE_DIR_IN,
                .flags = {
                        .pipe_bof = 1,
                        .short_xfer_ok = 1
                },
                .callback = rtwn_bulk_rx_callback,
        },
        ...

So maybe there is some room for improvement? I guess I'll have to continue this notes series with some USB goodness next time. I figured that USB had something to do with my performance issues, but I didn't expect it to completely dominate. I guess in retrospect it should have been obvious.