My Accessibility Stack and the future on Wayland

This is an article quite some time in the making. I’ve written 3 or 4 drafts of it over the last 4 months, looking for just the right thing to say. After I wrote the initial draft of this one, it sat in the drafts for another 2 months before I really finished it. In the end, I’ve decided that being straightforward is the best way to go, so here it is:

As the Linux Desktop transitions to a Wayland-only future, I will be locked out of my computer, as the accessibility software I rely on is left behind.

The desktop environment I use, KDE Plasma, has announced that in early 2027, X11 support will be removed from the system. That means in roughly 9 months, I will no longer be welcome on that desktop environment, being forced to cling to an older version or switch to a more niche environment that still supports it.

Why?

The Wayland desktop has been making great strides in accessibility recently, right? Well, there’s a subset of accessibility that absolutely no one is talking about, and that’s what I’m here to fix today:

Input devices.

Most of the discussion about accessibility refers to output, for users who have limited vision, or are blind. The series of articles written by Fireborn last year discusses the myriad problems that exist while trying to use the Linux desktop while blind, and likewise GNOME has been devoting their attention to, for instance, supporting AccessKit in their applications, which helps screen readers such as Orca render application contents via text to speech.

But accessibility cuts both ways, and it’s equally valid for people to have trouble conveying input to their systems. Such is the case for me, as last year, after a gradual but steady period of decline, I was diagnosed with Ehlers-Danlos Syndrome, a musculoskeletal genetic defect which wreaks all sorts of havoc on your body. An earlier draft of this article included my whole, nasty journey of getting diagnosed and treated for such a rare (and often misdiagnosed) condition, but it’s really not that important. What matters is that, in my case, it destroyed all those important little muscles in the wrist that let you flex your fingers, say, in order to use a keyboard or work a mouse.

Thanks to months of intensive physical therapy with a specialist in hypermobility disorders, I’ve regained partial use of my hands; depending on the day, I can get through maybe a few hours of typing with a specialized keyboard.

(As an aside, the fact that I’ve regained ANY use is something of a miracle, largely owing to good location relative to specialists, liberal medical leave policy in Massachusetts, and my undeniable position of privilege within the horrifically messed up U.S. healthcare system. There are many worlds in which I wind up permanently disabled and never even find out why.)

But regrowing a large set of small muscles that all withered away is terribly slow, very painful, and probably imperfect; I may never truly regain full use. The progress I’ve made doesn’t get me through a full workday, let alone weeks on end; I need another way to get through my life and career.

Enter Talon Voice.

Talon Voice

With possibly one of the most understated landing pages in existence, Talon doesn’t do much at first glance to convey that it’s probably the most powerful hands-free input system ever created. Talon is a deeply, thoughtfully crafted core of a hyper-fast and accurate Speech-To-Text ML model, a bespoke scripting language, and Python, all working in concert to enable nearly infinite extensibility for users to craft their own hands-free means of communication to their applications, either working with the application… Or against it. (That notion of “adversarial accessibility” is something that comes up quite frequently!)

The community series of scripts is the first thing to install to make Talon useful, and boy, is it a whopper. There’s tens of thousands of lines of code in here, all carefully hand-written by individuals to meet their specific needs and conglomerated into a whole so that others can benefit from their work.

With Talon, I can do things like:

  • Focus applications, saving me the mouse movement and clicks necessary to select them from the taskbar;
  • Write text, using Dictation Mode (most of this article was written with Talon);
  • Interact with my browser, using the Rango extension for browsers, completely hands free (which happens to be faster than traditionally moving around with a mouse anyhow);
  • Write my own script to call out over D-Bus to an external speech to text program for times when I’m writing longer prose (Whisper-v3-large is a truly remarkable model, understanding things like proper nouns it’s never heard of before, though it isn’t exactly fast);
  • Make a hissing noise to scroll, which is a motion that is persistently painful for me regardless of what input device I try using (though in the future, I may try to integrate a foot pedal with Talon).

The list goes on and on and on, but I’ve saved the two best extensions for last:

gaze_ocr

gaze_ocr is an unbelievably cool extension that lets you directly control your screen using OCR.

Using an OCR backend (one is not provided on Linux, but I was able to plug in RapidOCR), gaze_ocr will read the contents of your screen, allowing you to directly click on any object. Using an eye tracker, it will even disambiguate the text on the screen depending on what you’re literally looking at.
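To make the disambiguation idea concrete, here is a minimal sketch in Python. It is not gaze_ocr's actual code: the OcrWord type, the pick_target function, and the "closest match to the gaze point wins" rule are all assumptions made for illustration. It only shows how a spoken word that appears several times on screen could be narrowed down to the one you are looking at.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class OcrWord:
    """One word found by an OCR pass, with its bounding box in pixels (hypothetical type)."""
    text: str
    left: float
    top: float
    width: float
    height: float

    @property
    def center(self):
        return (self.left + self.width / 2, self.top + self.height / 2)


def pick_target(words, spoken, gaze):
    """Return the OCR word matching `spoken` that lies closest to the gaze point.

    Returns None when the spoken word does not appear on screen at all.
    """
    matches = [w for w in words if w.text.lower() == spoken.lower()]
    if not matches:
        return None
    gx, gy = gaze
    # Several matches: the one nearest to where the user is looking wins.
    return min(matches, key=lambda w: hypot(w.center[0] - gx, w.center[1] - gy))


# Two "Save" buttons on screen; the gaze point decides which one gets clicked.
screen = [
    OcrWord("Save", 10, 10, 40, 20),
    OcrWord("Save", 800, 600, 40, 20),
    OcrWord("Cancel", 860, 600, 60, 20),
]
target = pick_target(screen, "save", gaze=(790, 590))
```

A real implementation also has to cope with OCR misreads, multi-word phrases, and noisy eye tracker data, none of which this sketch attempts.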

I really cannot do this justice without a video, so I strongly encourage you to watch the sixty second intro here: https://youtu.be/qkFy66WF3bU

Suffice it to say that this alone makes life so cool. There is zero integration required on the part of any of the applications involved– yet I can interact with them anyway. Adversarial accessibility at its finest! I feel like I’m in a scifi movie whenever I use this package. But it isn’t even the most powerful extension I use…

Cursorless

Man, this is so cool.

The best way to explain Cursorless is with a video demonstration, which the main website provides. If that isn’t your jam, Xe Iaso wrote this pretty great demonstration in text form a few years back.

For those interested in ingesting neither of those, let me give a TL;DR– Cursorless is an extension for Visual Studio Code that builds a syntax-tree-aware representation of the source code, then lets you refer to tokens by hats drawn above them. Combine this with the community repo’s inbuilt support for writing dozens of different programming languages with your voice, and you have a powerful means to write code, completely hands free!

I never “got” Vim. Yes, I know how to use it, but it just didn’t “click” for me; I never understood the more advanced systems like buffers, macros, and crazy actions to blast around your code base. Cursorless, on the other hand, is actually pretty intuitive, though it has a learning curve. For instance, if I want to jump to the start of this sentence, I observe that the hat above “Cursorless” is colored pink and above the ‘C’, so I say: “pre pink cap”, referring to that character by Talon’s phonetic alphabet.
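For readers who think better in code, here is a toy model of the hat idea in Python. It is a sketch under heavy assumptions, not how Cursorless is implemented: assign_hats and resolve are hypothetical names, the color list and allocation order are made up, the phonetic table is only a small slice of Talon's alphabet, and "post" is assumed to be the counterpart of "pre". The point is just that every visible token gets a unique (color, letter) pair, so a short spoken phrase can name exactly one place in the text.

```python
import re

# Assumed color order; "default" stands for a hat with no spoken color.
COLORS = ["default", "blue", "green", "red", "pink", "yellow"]

# Partial, illustrative slice of a phonetic alphabet ("cap" -> c comes from the text above).
PHONETIC = {"air": "a", "bat": "b", "cap": "c", "drum": "d"}


def assign_hats(text):
    """Give each word token a unique (color, letter) hat.

    Returns a dict mapping (color, letter) -> (start, end) character offsets.
    Tokens that cannot get a unique hat are simply left without one.
    """
    hats = {}
    for match in re.finditer(r"\w+", text):
        hat = next(
            (
                (color, ch.lower())
                for color in COLORS
                for ch in match.group()
                if (color, ch.lower()) not in hats
            ),
            None,
        )
        if hat is not None:
            hats[hat] = match.span()
    return hats


def resolve(command, hats):
    """Turn 'pre [color] <phonetic>' or 'post [color] <phonetic>' into a cursor offset."""
    position, *rest = command.split()
    color = rest[0] if len(rest) == 2 else "default"
    letter = PHONETIC[rest[-1]]
    start, end = hats[(color, letter)]
    return start if position == "pre" else end


hats = assign_hats("c c c")
offset = resolve("pre blue cap", hats)  # cursor lands before the second "c"
```

The real extension works on a syntax tree rather than a regular expression, and it chooses hats far more carefully so that they stay stable as you edit.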

Cursorless empowers me to write in a way that no other editor ever has, keyboard or voice based. I love it so much I’ve even been writing bespoke implementations of it for software I use at work.

Really, that’s what ALL of Talon does– it empowers me to interact with my computer, not just in spite of my disability, but in a truly novel and more powerful way. In a way, I’m grateful to be on the cutting edge of human input systems; I certainly never pass up an opportunity to show off Cursorless when I can.

That is what the Wayland-only future will strip me of.

OK, so why is it broken on Wayland?

If you’re a Linux desktop user, you’ve probably at least heard of the transition to Wayland, away from the 40+ year old X11 standard. The important thing is that absolutely no one wants to touch that garbage fire of a codebase anymore, and Wayland IS the future the FOSS desktop community has decided on.

But they left a lot of important things behind, and have spent well over a decade scrambling to catch up.

Talon requires deep integration with the window manager and compositor to carry out even the most basic of its duties, and Wayland offers… Absolutely no way to perform any of those actions.

The most basic task at hand– inputting text into the system– seems to simply be impossible in a truly “Wayland” way. Last year, the maintainer of xdotool, the de facto input automation tool for X11, investigated how to do this and walked away confused. This is to say nothing of the more advanced APIs needed for window management, mouse positioning, clipboard management, screen reading, and so on and so forth… Wayland, as an ecosystem, supports NONE of this.

“There’s a way to do it in GNOME, but not KDE”, or “Yeah, wlroots implemented this ages ago” is not an answer to this problem. Linux is already a niche ecosystem (despite how much we all want that to not be true) and to further ask cross-platform developers to write implementations of their deeply-integrated system for 3 or more different compositors (what about Niri, which is built on the Smithay compositor library??) is entirely unreasonable. Oh, and by the way, not a single one of those compositors implements the entire API surface you need.

And so it’s all going away

Frustrated by the endless lack of progress towards a real set of solutions for the entire ecosystem, and inundated by an endless series of requests for Wayland support which he cannot provide, Aegis, the main (and only) developer of Talon, has made a declaration: Enough. Talon Voice will imminently remove ALL Linux support from the public release, as X11 continues to sunset and users are switched to an environment in which their system can no longer function, with no option to go back.

Talon is divided into a free and paid tier. The paid tier will, for now, retain X11 support; it’s more about relieving the endless burden of free-tier users installing the software and being surprised when it doesn’t work.

This is not a desirable outcome for anyone. It’s also a declaration made not out of spite, but pragmatism; without any means to support the 2027 Linux desktop, the only viable answer is to… Remove support for the Linux desktop.

The Tasking

Aegis has requested that those of us who wish to not see Talon on Linux die out do the following:

  • Do not, for any reason, discuss Wayland support with him;
  • As a community, gather together, and successfully implement the entire API surface needed for Talon on GNOME, KDE, and wlroots,

At which point a new Wayland backend will be considered for Talon.

As a community, we’ve struggled with how to approach this, which is one of the reasons why writing this article has been so difficult and taken so long. Wayland as an ecosystem is… Known to be, shall we say, unkind to problems that its users face. The most recent in a long, long line of examples is the xdg-session-management protocol, which took a whopping six years from the initial pull request to completion. The runner-up is the also recently merged ext-zones protocol, which took over two years to merge, not counting the multiple years before that of R&D to develop the initial concept.

When I first started writing this article, I believed the following: “This seems like a problem where maintainers are simply not aware of our needs; seeing as absolutely no one is talking about this, it’s impossible to implement a solution for a problem that’s not being discussed.”

It now seems more accurate that “no one is talking about this” is basically a self-fulfilling prophecy. As preliminary research, I read a great many mailing list threads, most of them from two to three years ago. This seems to be the last time a serious effort took place to contact maintainers about the issues we face.

And what I read was downright ugly.

I will spare readers the furthest reaches of the many things I read, but two responses in particular jumped out at me–

  • This response from Nate Graham, which discusses the practical reality that contributing upstream to Wayland is mandatory as an application developer,
  • And this response from a GTK maintainer in a thread discussing Talon (which is not mentioned by name) who refuses to engage with the discussion, refers to us as “accessibility maximalists”, and refers to resources that are not topical to the point at hand (again, this is about input, not output).

Reading these two answers, many other threads like them, and the general state of wayland-protocols was and is super demoralizing to me, someone who believed that some extra communication was all we needed. What it tells me is that Wayland as an ecosystem demands our participation, yet either makes said participation impossible (“oh, you’re an accessibility maximalist, I don’t need to listen to you”) or requires multiple years of full-time work to move the needle ever so slightly forward (see: any major Wayland protocol written in the last five years).

So now I think this is the reason no one is talking about this; it just seems like there’s no point in even trying.

I can’t imagine how many of these types of discussions Aegis had behind the scenes; no wonder he sees the Linux desktop as a lost cause and has delegated any chance of salvaging it to the community.

Now what?

I don’t know.

The objective of this post was to clearly outline my accessibility needs, how Talon fills those needs, and why the Wayland-only future will leave those needs unmet.

I love the Linux desktop and do not want to be forced to leave it. I like my cozy Plasma DE, support for every game I want to play, development tools I need, cutting edge tools for new hardware blocks; an experience that’s clean, all free of ads and bad UI redesigns and AI injected into every corner.

I guess the real reason I’m still writing this is as fireborn said last year: Wayland is growing up, and now we have no choice but to try and contribute.

So maybe there IS someone out there who knows what to do to move forward. This article is written with the hopes of reaching those people. Can you help us?

Epilogue

Since I started using this technology six months ago, I’ve grown to love it. Of course, I would be happier if I could just use a traditional keyboard and mouse and didn’t have all these wacky health issues, but given the hand I’m dealt (or rather, lack thereof) I’ve been able to get along just fine!

Having to think outside the box of human input devices has opened my eyes to just how broad the possibilities are, and how restricting the keyboard and mouse paradigm is when we have modern computers that are so powerful you can use a machine learning model to scan every single character on the screen, track which of those characters you’re looking at with your eyes, and directly click on that text, within a second. Just to name an extreme example.

In this way I’m grateful to be on the cutting edge of HID. And what a time to be at its frontier! I’ll have another article out about my special keyboard, the Svalboard, soon, which didn’t even exist three years ago. Talon keeps getting better all the time; there’s a zillion commands already available for it, and the means to write your own easily if those don’t suffice.

I hope everyone reading this never has to live with the health issues I do– but I also hope you’re encouraged to take a look at what’s possible yourself. There are new and better ways out there!

