Debugging intermittent failure in a custom-ish keyboard

Recently(ish), one of my friends modified the mysterium mechanical keyboard and distributed some to our friends. I own one of these, which works perfectly. However, when Emma assembled hers, she found that it would occasionally stop working, needing a few seconds to reconnect to the computer before becoming usable again. This was bad enough that she stopped using it entirely. Eventually, I got bored one day and tried to fix it. The process, while somewhat janky, seems to have worked fine, and is described below.

First, I took the keyboard and attempted to use it myself, to try to get a feel for the fault. It came up pretty quickly and frequently, which is nice, and I immediately noticed a few qualities:

From this, I started to suspect a microcontroller error - this keyboard uses bitbanged USB, and I wondered if that could be dying if it had to send too much in quick succession.

Unfortunately, I had no idea what to do about this, which left me helpless for a while.

Fortunately, this diagnosis was also entirely incorrect.

The real diagnosis came when I tried to find the minimal case to replicate the problem. Specifically, I found that if I just keysmashed, it usually didn't happen - just when I tried to type sentences. However, it did happen when I was mashing backspace. Experimenting with mashing keys, I found that I could reliably cause it to happen by mashing space or backspace, but not really anything else.

This made me suspect it might be ESD-related - backspace and space are large, stabilised keys, and I felt that could generate static, or something. Furthermore, taking off the keycap seemed to stop this from happening. (The author does not actually know anything about ESD.)

Unfortunately, this was also completely wrong. Taking the keycap off didn't actually stop it from happening, just made it rarer. However, in trying to test this, I found that it only happened when pressing the key quite hard - which led to the discovery that you could do it on other keys if you also hit those hard.
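This sort of narrowing-down is easier to trust if each trial gets logged and the disconnect rate is compared one factor at a time. A minimal sketch in Python, assuming a hand-kept log of trials; the function name, the field names, and the sample entries are all invented placeholders, not measurements from this keyboard.

```python
from collections import defaultdict

def disconnect_rate(trials, factor):
    """Group trials by one factor and return the fraction that disconnected."""
    counts = defaultdict(lambda: [0, 0])  # factor value -> [disconnects, total]
    for trial in trials:
        bucket = counts[trial[factor]]
        if trial["disconnected"]:
            bucket[0] += 1
        bucket[1] += 1
    return {value: hits / total for value, (hits, total) in counts.items()}

# Placeholder entries only (hypothetical), to show the shape of the log.
example_trials = [
    {"key": "space", "force": "hard", "disconnected": True},
    {"key": "space", "force": "soft", "disconnected": False},
    {"key": "a", "force": "hard", "disconnected": True},
    {"key": "a", "force": "soft", "disconnected": False},
]

print(disconnect_rate(example_trials, "force"))
print(disconnect_rate(example_trials, "key"))
```

With a log shaped like the placeholder one, grouping by force separates the failures cleanly while grouping by key does not, which is the pattern that points away from any particular key and towards how hard it gets hit.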

This finally led me to the conclusion that it might just be the board flexing when you hit a key hard. The most likely culprit in this case seemed to me to be the USB connector, which is delicate and has incredibly small solder points.

I tried to reflow the solder on the USB connector, which... made the problem so much worse. Now, even normal letter typing would result in this disconnect. This did seem to confirm my theory that it had something to do with the USB port, though.

Next, I tried to clean the solder joints on the USB port, in preparation for reflowing it again, since the joints were covered in burnt flux by that point. On a whim, I tried plugging it in again...

...and it worked, seemingly perfectly. I can sometimes force a disconnect if I slam the right part with both hands, but for normal typing it seems completely fine.

Conclusions