Spoken Mathematics on The Web

DAVID TSENG: Hi everyone, my name is David Tseng. I am a software engineer on the accessibility team here at Google. And I have here with me my colleague, Volker Sorge, who is a visiting professor from the University of Birmingham. We're here today to talk to you about ChromeVox, and some of the exciting work we've been doing to make reading of mathematics on the web possible. To tell you more about this, I'm going to hand it off to Volker for a brief introduction to the work. And I'll then come back for a brief introduction to ChromeVox, in general. Volker. VOLKER SORGE: Thank you, David. So my name is Volker Sorge. I'm a senior lecturer at the University of Birmingham in the UK. And I've been spending the last year, together with the ChromeVox team here, working to make scientific material that's out on the web accessible for people with visual impairments. And first, we have concentrated on making mathematics accessible, using ChromeVox. Now why is that important? Well, one of the problems with scientific material is that it is a hurdle in education: many people who have visual impairments are excluded from having scientific material as readily available as their peers would have. Because it is much harder to make scientific material fully accessible. And particularly, if you think about something like mathematics, it is not as straightforward as reading just regular text. Therefore, it is often very difficult to make it accessible using your average screen reader. Now there have been a number of efforts to make mathematics on the web, in particular, accessible. But these are generally efforts that require either particular, special tools or particular plug-ins for special browsers. So what we have been trying to do with ChromeVox is make mathematics accessible in just the standard screen reader-- so making, effectively, mathematics a first-class citizen in the world of accessibility.
And we have done, also, in addition, some special elements, like, in particular, exploration of mathematics, that we believe are fairly novel, and that we're going to demonstrate to you in the upcoming hour. So first, we'll start with a general introduction to what ChromeVox actually is, before we then go into more detail on how one can use it to speak mathematics, and also, how one can, in particular, use it to customize how mathematics is being spoken. And for the general introduction, that's going to be done by David. David, please. DAVID TSENG: Thank you, Volker. So just to get everybody up to speed and on the same page, I'm going to give a demo of ChromeVox, and its general features and functionality as a general screen reader. So for the past, I'd say, three or so years, we've been developing ChromeVox. And ChromeVox is just a Chrome extension. So it's all written in JavaScript. And we have access to all the information that any other web page would. And this gives us a lot of things that ordinary screen readers don't get, and everything that they do get. So we've been able to create something that, I think, is fully featured, and allows a blind user to navigate all of the web. So to start off with, I will show everyone a standard page from Wikipedia. This is a square root. The topic for this page is on square roots. And I will start off by simply tabbing around, and letting everyone hear the type of feedback we provide. CHROMEVOX: Navigation-- internal link. This page has one alert. Do you want Google Chrome to save your password? DAVID TSENG: So there, you've heard quite a few things. So it says a lot. It says, first of all-- the first thing that you've heard was "navigation" and "link--" "internal link." So those bits of information are contextual. And it tells you what the currently focused item is. So a user who's using a screen reader can then know how to interact with this.
So since it's a link, one can press it, and therefore land in another place on this page, since it's an internal link. And the second bit of feedback you heard was an alert given by the Chrome browser. So we actually speak that, as well. So besides landing on focusable items, we can actually move about the page and move to things that aren't focusable. So you do that by pressing the ChromeVox modifiers, and the arrow keys. So let's hear what that sounds like. CHROMEVOX: Comma. Search-- internal link. Quote square roots quote redirects here. For the music festival, see square roots-- link. DAVID TSENG: And you hear quite a few other things there. There were some static pieces of text. There were some more links. And one other item that is kind of interesting to point out is-- you heard a sound effect, kind of like a "ding" sound. And that actually represents a link. So whenever a user hears that, they automatically know that they can click on it. So some other interesting things are that a user doesn't necessarily have to go item by item. They can actually jump from section to section. And that's done via another keyboard command. And let's hear what that sounds like. CHROMEVOX: Contents-- heading two. DAVID TSENG: So there, I just skipped a lot of material on the page, and was able to jump right to the contents listing. So let's try that again, and see if we find anything else interesting. CHROMEVOX: Properties-- heading two. DAVID TSENG: All right, and keep on going. CHROMEVOX: Left bracket, edit source-- link-- pipe, edit, beta, right bracket-- link. Enlarge the graph of the function f left parenthesis x right parenthesis equals square root of x-- math. DAVID TSENG: And there, we have something that alludes to the rest of the talk. And it's how we actually make it possible to read that expression, and make it possible for ChromeVox, and the user, to then communicate, using something that's a little bit richer and a little bit more unique.
So with that, one last thing to show is we can actually read content, continuously, just by pressing another command. And it'll actually read it kind of like a book, from this point on to whenever you want to stop it. CHROMEVOX: Made up of half a parabola-- link-- with a vertical directrix-- link-- dot. The principal square root function f left parenthesis x right parenthesis equals square root of x-- math-- usually just referred to as the quote square root function quote, is a function. DAVID TSENG: And I'll just stop it there. And that, overall, gives you a flavor of what ChromeVox is, and what it does, and how the user would use it, and a little bit of a taste of how math works, and what spoken math sounds like. So with that, I will throw it back to Volker. And he will continue on with some of the more interesting things that we're doing with math, and how this is all possible. VOLKER SORGE: Thank you, David. So as you could already hear when David was demonstrating the continuous reader on the previous paragraph, there was math contained in the paragraph. And the math was just being spoken out in a fairly verbose way. We'll talk about that a bit later, why this is so verbose. And in order to explain how that is being spoken out, it is probably worthwhile going a bit into detail on how math is actually represented on the web. And in particular, for those who are not fully aware how math on webpages works, or should work, I'll give a quick introduction on that now. So ideally, mathematics on the web is represented by a special markup language called MathML. MathML has been around for some time. And it has evolved and matured to a degree that it's now been made a standard for HTML5, and as a consequence, also for EPUB 3. So in other words, every EPUB 3 reader, in the future, should also be able to deal with MathML. Now the question, of course, arises, why is there a special markup language just for mathematics?
Well, the reason is that mathematics is fairly involved, in terms of the way expressions are being set up and how they're being structured, which is normally not doable-- or not easily doable-- with just regular HTML. In particular, consider that regular HTML is very good at dealing with text and these things. It can also do sub and superscript. However, as soon as things become more involved, as in a mathematics expression-- which is, by nature, effectively two dimensional and very, very nested-- HTML alone can normally not deal with it. Therefore, MathML provides all the basic functionality-- or the basic tags-- in order to create complex mathematics expressions, and display them on the web. Now while it is, in general, a good thing that MathML is fairly mature, and that it's a part of standards, what is a drawback about MathML is that it is not implemented in all browsers, or all EPUB 3 readers. As a consequence, there are several ways of getting mathematics on the web to work nevertheless. So one of the ways it is done by many sites, such as Wikipedia, by default, as well as other pages, is that one actually displays an image of the mathematics instead of having a MathML expression in there. And therefore, it can be easily read by people who can actually see the image. But it cannot necessarily be picked up by a screen reader. Now the advantage of this is, sometimes, that the real mathematics is still provided somewhere in the background, not visible. And therefore, it can indeed be used by a screen reader. And I'll demonstrate it to you later, how that is being done. The other way of displaying mathematics on all browsers on all platforms is by using a little library called MathJax, which is done by the MathJax Consortium. And if you want to check this out, this is at mathjax.org. And the way this works is that MathJax can be injected in any page that contains mathematics. And it's JavaScript.
And it just renders the mathematics, which is either given as MathML or a more traditional markup language like LaTeX or ASCIIMath. Now the challenge with this process, for a screen reader that actually wants to use the mathematics in order to speak it, is that we actually have to deal with all these types of mathematics as they can occur. So we have to be able to deal with just standard MathML expressions as well as expressions rendered by MathJax, or expressions that are just hidden somewhere, in an alternative text, behind the image. And I'll show you, now, how that actually works, in practice. Right, so let me just quickly demonstrate how this is then spoken by ChromeVox. So let's go to the first math expression, again, in this paragraph. CHROMEVOX: f left parenthesis x right parenthesis equals square root of x-- math. VOLKER SORGE: As you can see, the first maths expression here, in the DOM, is actually the math expression which is in the caption of this particular image. But it still spoke it. And it went to it. And you can see it's been indicated by this highlighter, which is, for instance, particularly helpful if you need to see where you are in the text-- so for instance, if you are dyslexic. And for dyslexic students, this is useful. And we'll just go to the next maths expression, which should now be this one in the paragraph. CHROMEVOX: --left parenthesis x right parenthesis equals square root of x-- math. VOLKER SORGE: Right. Now as I said, this works, now, nicely. Because I'm actually logged into Wikipedia. And if I were to now log out of the Wikipedia page, I then would get it just as an image. And it would still work. Because we can use the alternative text, which is in the background, which is actually the LaTeX expression which you saw at the beginning before the MathJax was rendering. Now I'm not going to log out of this Wikipedia page, because we will need it later.
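A minimal sketch of the three cases Volker lists (native MathML, MathJax-rendered math, and an image with math in its alternative text) might look like the following. The plain-object node shape, its field names, and the function mathSourceOf are hypothetical and chosen only for illustration; this is not ChromeVox's actual code, and a real screen reader would inspect DOM nodes and need stronger checks before treating alt text as math.

```javascript
// Hypothetical, simplified stand-ins for page elements; a real screen reader
// would inspect DOM nodes instead of plain objects.
function mathSourceOf(node) {
  // Case 1: native MathML markup in the page.
  if (node.tag === "math") {
    return { kind: "mathml", source: node.markup };
  }
  // Case 2: math rendered by MathJax. We assume the original markup is still
  // reachable from the node (the field names here are assumptions).
  if (node.renderedBy === "MathJax") {
    return { kind: "mathjax", source: node.original };
  }
  // Case 3: an image whose alternative text carries LaTeX or ASCIIMath.
  if (node.tag === "img" && typeof node.alt === "string" && node.alt.trim() !== "") {
    return { kind: "alt-text", source: node.alt.trim() };
  }
  return null; // nothing we can treat as math
}

const captionImage = { tag: "img", alt: "f(x) = \\sqrt{x}" };
console.log(mathSourceOf(captionImage));
```

Whatever the source, the result would then be handed to one common speech step, which is why all three cases return the same shape.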
However, let's go to a different page, where we have a similar effect. CHROMEVOX: Sum from Wolfram Math. VOLKER SORGE: And this is MathWorld, which is a particular encyclopedia, just for mathematics. And what happens there is that you have the mathematics given as an image only, always. However, you also have alternative text. And in that case, it's ASCIIMath. And we can leverage this ASCIIMath in order to pronounce it. And just to show you that the math expressions here are indeed images, I'll just grab it quickly, and I'll move them around. See, this is an image. And I can take this sum, and just move it around on the screen here. And all these are images. CHROMEVOX: A sum is the result of an addition. For example, adding 1, 2, 3, and 4 gives the sum 10, written 1 plus 2 plus 3 plus 4 equals 10. Full stop. Math. VOLKER SORGE: So you could hear, now, that this is, indeed, a math expression. Although it was an image, it's being fully pronounced because we can leverage the alternative text. And you could hear, also, that it was a math expression, because it was specially announced using an earcon as well as, in the end, an announcement that this is a math expression. Similarly, we can go to the next math expression, for instance, and-- CHROMEVOX: k equals 1 under and 4 over n-ary summation. k equals 10-- full stop-- math. VOLKER SORGE: So again, this is the maths expression how it's being spoken at the moment. And this now gives us the possibility to have math spoken where it's on Wikipedia, even without logging in, as well as on MathWorld, as well as on many other sites where there's a similar effect. So for instance, there are many math blogs out there. So one of the blogs I'm going to demonstrate here is a famous blog from a famous mathematician, Terry Tao, who is a Fields Medal winner. And he regularly blogs on his latest mathematical research.
And he does it in a similar vein to what we've just seen on the MathWorld site, in that the mathematics is actually given in images. And I'll grab one of those images here, and move it around, so you can see that this is an image. So this is an image. Here's another one. But the alternative text in the background is indeed LaTeX. And therefore, we can actually speak it, using ChromeVox. So let's just speak this first definition, for instance. CHROMEVOX: Definition one-- multiple dense divisibility-- let y greater than or equal to 1-- math. For each natural number k greater than or equal to 0-- math-- we define a notion of k-- math-- tuply y-- math-- dense divisibility recursively, as follows-- list with two items-- every natural number-- list item-- n-- math-- is 0-- math-- tuply y-- math-- densely divisible. If-- list item-- k greater than or equal to 1-- math-- and-- VOLKER SORGE: Right, I'll stop it here. So what you could hear was that all the math in the paragraph has been spoken. Although it's given as images, we used the alternative texts in order to translate it with the MathJax library. It's the same library we saw, previously, rendering the math on the Wikipedia page. When it comes back from the MathJax rendering, we can then use this representation in order to speak it. However, we do this in the background, so none of the content in the page is actually physically changed. So it still has the same visual appearance as before. What you could also hear, when we were speaking the maths, is that it was always giving a little earcon, and the math announcement, which gives you the possibility to actually find out where the math is. And this is, in particular, important if you want to have more interaction with the math.
And with this, I'll pass back to David, who's now going to explain to you one of the specialities of ChromeVox which allows you to actually interactively explore a mathematics expression, which we believe is a very important step in order for people to fully engage with mathematical content that's on the web. DAVID TSENG: I think everybody can identify with me, personally, in that hearing all of this is kind of a lot to take in. I mean, if you consider the often quoted figure of 7 plus or minus 2 bits of information that we, as humans, can store and keep in our short term memory, what we just heard was way more than seven words. So what do we do about that? Well, speaking personally, I, in the days when I was in high school and in university, used several methods to actually access mathematics. One primary way, and one that's pretty popular still, is to use audio books. And at the time, we actually used these four-track audio tapes-- so a tape which has two sides, using both stereo channels, splitting those into mono, and having each channel record audio tracks. And someone-- a really, really nice volunteer-- would sit down in a room and read through an entire math textbook. So this included anything from algebra all the way up to calculus and even beyond. I remember listening to a real analysis book in college, which was a once-in-a-lifetime experience. So it was really, really difficult. And there was a lot of rewinding, and pausing, and taking notes. So we feel like ChromeVox can do a lot better, since we have a computer, and we have the ability to write code to make things easier. So what have we done, exactly? Well, think about any expression-- and I'll demo this, briefly, later on. But just think about something that you've seen in the past, say, the quadratic formula. How would you actually read that? It's not exactly obvious. And even harder is then to tell a person who's never seen it before where all the pieces lie.
So if you think of it like geography-- where are all the countries? And where are all the large pieces? Where are all the continents? And if you want to dive in to a specific region, well, what's inside there? And how do you actually give a quick summary of all of these things? And how do you actually let a person then ask you to describe something further? So this is the challenge that ChromeVox has faced. And we feel like we've come up with a pretty good solution. So let me go ahead and just demo that for you now. We can hear how it does now. And I can show you how a person would actually explore the various pieces of it. CHROMEVOX: We have x equals minus b plus/minus square root of b square minus 4ac divided by 2a-- math. DAVID TSENG: Now you'll immediately notice quite a few things. There were a lot of words, for one thing. We also slowed down the speech rate quite a bit so that we can do some more intelligent things like speeding up certain parts, slowing down others, inserting pauses, changing pitch. So one thing to add there is, you'll notice a little bit of pitch drift. So all the stuff in the numerator, you'll hear go up a little bit. All the stuff in the denominator, you'll hear go down a little bit. In that power, the b squared, the squared part is actually up a little bit, even more, in pitch. So it gives you another dimension to work with. So pitch gives me, as someone who is the listener, an additional hint as to how high something, or low something, is vertically represented. But this is still a pretty beefy example. If you remember your first algebra class, it's a lot to take in for someone who is seeing this for the first time. So how do we actually let someone explore it? Well, we have one of these other ChromeVox commands that lets you go and dive deeper into the structure. So I'll go ahead and hit that. CHROMEVOX: Entered math-- x equals minus b plus-- DAVID TSENG: And you heard that again.
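The pitch cues David describes for the quadratic formula (numerator up, denominator down, the exponent up a little more) can be sketched as a walk over an expression tree. The node types, the function withPitch, and the step size of one unit are all assumptions made for illustration; how ChromeVox actually maps structure to text-to-speech settings is not stated in the talk.

```javascript
// Expression tree for (-b ± sqrt(b^2 - 4ac)) / 2a, using made-up node types.
const quadratic = {
  type: "fraction",
  numerator: {
    type: "row",
    children: [
      { type: "text", text: "minus b plus/minus square root of b" },
      { type: "superscript", child: { type: "text", text: "square" } },
      { type: "text", text: "minus 4ac" },
    ],
  },
  denominator: { type: "text", text: "2a" },
};

// Walk the tree and attach a relative pitch offset to each spoken chunk.
// The unit step is arbitrary; a real engine would map it to TTS settings.
function withPitch(node, pitch = 0, out = []) {
  switch (node.type) {
    case "text":
      out.push({ text: node.text, pitch });
      break;
    case "row":
      node.children.forEach((child) => withPitch(child, pitch, out));
      break;
    case "superscript":
      withPitch(node.child, pitch + 1, out); // raised a little further
      break;
    case "fraction":
      withPitch(node.numerator, pitch + 1, out); // numerator goes up
      out.push({ text: "divided by", pitch });
      withPitch(node.denominator, pitch - 1, out); // denominator goes down
      break;
  }
  return out;
}

console.log(withPitch(quadratic));
```

Because the offsets accumulate along the path from the root, the "square" chunk ends up higher than the rest of the numerator, which matches the effect described in the demo.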
There was a sound icon that represents exploring something. And you heard the expression starting to read again. And I just stopped it, for the sake of brevity. So how do we actually go and figure out what pieces are here? What's the general geography? So there is this concept of granularities, in ChromeVox, that lets you zoom in and zoom out, essentially, of something on the page. So I'll go ahead and press that. And let's hear how that sounds. CHROMEVOX: Down to level one-- x-- DAVID TSENG: So I just moved down a level. And I'm in a bigger chunk not as big as the whole expression. So what is this? This is x. And it's obviously very important, so we let the user hear that in isolation. And we have some keys to move you around. And those are basically to move you forward and backward on the current level of things. So let's hear what that sounds like. And I will move forward. CHROMEVOX: --equals-- DAVID TSENG: So "equals" is obviously also very important in this equation. So what's the next thing? And I will go ahead and press Next again. CHROMEVOX: --minus b plus/minus square root of b square minus 4ac divided by 2a. DAVID TSENG: And you heard, again, that whole, big fraction. And you heard the pitch changes, and everything else. And I will press the Next key again. CHROMEVOX: Minus b DAVID TSENG: Oup, and you heard a little "ch-ch" sound. And that represents that we've bumped against an edge. So there's only three big chunks in this thing. And as a user that, perhaps, has never seen this formula before, well, I now have a really good sense of all the big pieces, and all the big players, here. But say I, still, am a little bit hazy on this fraction, and I want to look a little bit more. I can actually dive in even deeper. So let's do that, and hear how that sounds. CHROMEVOX: Down to level 2-- minus b plus/minus square root of b square minus 4ac DAVID TSENG: And you heard the numerator. So that sounds great. And let's move forward. 
CHROMEVOX: 2a DAVID TSENG: And that's the denominator. So that all makes a lot of sense. And let's try moving forward again. CHROMEVOX: 2a DAVID TSENG: Oup, bounced against an edge. Move back. CHROMEVOX: Minus b plus DAVID TSENG: OK. We heard that before. Move back again. CHROMEVOX: Minus b DAVID TSENG: Oup, another edge. All right. So we've gotten a really clear sense of where everything is, again. So I can even dive even further. And let's do that, real quick, into this numerator. CHROMEVOX: Down to level 3-- minus DAVID TSENG: OK. Minus CHROMEVOX: Minus DAVID TSENG: Oup, that's the beginning. So let's move forward, instead of backward. CHROMEVOX: B plus/minus DAVID TSENG: Forward CHROMEVOX: Square root of b square minus 4ac. Square root of-- DAVID TSENG: --oup, and that's the end CHROMEVOX: --b square minus 4ac. DAVID TSENG: And if we really wanted to, we can even dive into the square root. So let's do that. CHROMEVOX: Down to level 4-- b minus 4ac. DAVID TSENG: And that's it. So with that, it is our hope that students will be able to then really easily explore an equation like this. That's now becoming possible with what we're doing with ChromeVox. And in the past, you would have had to rewind and fast forward through a tape, and do it that way, or perhaps, if you're really lucky, get math through braille. But with the possibility of using online math, now you can get it in real time, and look up a wealth of information on the web. So with that, I'll pass back to Volker, who will discuss some of the intricacies of actually applying some of these text-to-speech changes, and the way we speak, and some cool stuff that you, as a developer, can actually do to improve any math that you might come across. VOLKER SORGE: And as we have mentioned earlier, what we've demonstrated to you at the moment, the way things were spoken, is very, very verbose in the sense that every single element is indeed being spoken out.
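The level-by-level exploration from David's demo (dive down, step forward and backward, hear a sound when bumping against an edge) can be sketched as a cursor over a tree. The class MathCursor, the node shape, and the way the formula is chunked are hypothetical simplifications; the chunks ChromeVox produced in the demo differ slightly, and the string "edge" stands in for the earcon.

```javascript
// A cursor that remembers the path from the root to the current node.
class MathCursor {
  constructor(root) {
    this.path = [root];
  }
  get current() {
    return this.path[this.path.length - 1];
  }
  get level() {
    return this.path.length - 1;
  }
  // Dive one level deeper, landing on the first child.
  down() {
    const kids = this.current.children;
    if (kids.length === 0) return "edge";
    this.path.push(kids[0]);
    return `Down to level ${this.level}: ${this.current.speech}`;
  }
  // Zoom back out to the enclosing chunk.
  up() {
    if (this.path.length === 1) return "edge";
    this.path.pop();
    return `Up to level ${this.level}: ${this.current.speech}`;
  }
  // Move among siblings; offset is +1 (next) or -1 (previous).
  move(offset) {
    if (this.path.length === 1) return "edge";
    const siblings = this.path[this.path.length - 2].children;
    const index = siblings.indexOf(this.current) + offset;
    if (index < 0 || index >= siblings.length) return "edge"; // play the earcon
    this.path[this.path.length - 1] = siblings[index];
    return this.current.speech;
  }
}

const leaf = (speech) => ({ speech, children: [] });
const numerator = {
  speech: "minus b plus/minus square root of b square minus 4ac",
  children: [leaf("minus b"), leaf("plus/minus"), leaf("square root of b square minus 4ac")],
};
const fraction = { speech: `${numerator.speech} divided by 2a`, children: [numerator, leaf("2a")] };
const formula = { speech: `x equals ${fraction.speech}`, children: [leaf("x"), leaf("equals"), fraction] };

const cursor = new MathCursor(formula);
console.log(cursor.down());  // first chunk on level 1
console.log(cursor.move(1)); // next chunk
console.log(cursor.move(1)); // the whole fraction
console.log(cursor.move(1)); // no more siblings
console.log(cursor.down());  // into the fraction
```

Keeping the whole path rather than only the current node makes both "zoom out" and sibling lookup cheap, at the cost of assuming the tree does not change while it is being explored.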
Now that is not necessarily how mathematics is being spoken in real life. All right. Often, you omit things, although they are written. And this is something we can-- I'm going to demonstrate a few examples where things become very awkward when everything is being spoken, and where we then apply changes to our underlying structure and rule base in order to have them spoken more intelligently. And I will then, afterwards, tell you how this is actually being done, and how that can be customized by users, themselves, or by creators of web pages. Right. So this is, here, a Wikipedia page with a matrix example. So matrices are particularly difficult to speak, because they are fully two dimensional objects. And I'll show you, at first, how we could speak it in a full, verbose fashion. CHROMEVOX: Left square bracket matrix-- element one, one-- 1. Element two, one-- 9. Element three, one-- minus 13. Element one, two-- 20. Element two, two-- 5. Element three, two-- minus 6. Right square bracket-- full stop-- math. VOLKER SORGE: Right, so as you could hear, everything was being spoken. Every bracket, every punctuation is being spoken. Now we have the possibility, in ChromeVox, to actually change the underlying representation in a way that we can get a bit more intelligence, a bit more semantics in there, if you like. And actually, I'll do that now, so you can see how this is being spoken then. CHROMEVOX: Matrix-- row one, column one-- 1. Column two-- 9. Column three-- negative 13. Row two, column one-- 20. Column two-- 5. Column three-- negative 6. Math. VOLKER SORGE: Right, so as you can see here, ChromeVox is slightly more intelligent. It applies a slightly better way of pronouncing rows and columns. In particular, it omits some of the punctuation which is not strictly necessary for a reader to know, exactly. Whether this is a round bracket or a square bracket, in this particular instance, might not make any big difference. Right.
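As a rough sketch of the second, more semantic reading, the matrix from the demo can be spoken by announcing rows and columns and leaving out the brackets. The function names and the exact punctuation of the output are assumptions for illustration and do not reproduce ChromeVox's rule base.

```javascript
// The 2 x 3 matrix from the demo.
const matrix = [
  [1, 9, -13],
  [20, 5, -6],
];

// Say "negative" for a leading minus sign, as in the semantic reading.
function speakNumber(n) {
  return n < 0 ? `negative ${-n}` : String(n);
}

// Announce the row only at the start of each row, then just the column.
function speakMatrix(rows) {
  const parts = ["Matrix"];
  rows.forEach((row, r) => {
    row.forEach((value, c) => {
      const position = c === 0 ? `row ${r + 1}, column ${c + 1}` : `column ${c + 1}`;
      parts.push(`${position}: ${speakNumber(value)}`);
    });
  });
  return parts.join(". ") + ".";
}

console.log(speakMatrix(matrix));
```

Announcing the row number only once per row is the kind of omission Volker mentions: the listener can still reconstruct every position, but hears far fewer words.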
These things can get particularly worse when the underlying representation is very similar, however the meaning is particularly different. So let's go back to our square root example. And I'll read you the first example here, in the normal, verbose style. Let's go to this case statement down here, on the page. CHROMEVOX: Square root of x square equals-- vertical line-- x-- vertical line-- equals-- left curly bracket, matrix element one, one-- x comma-- element two, one-- if x greater than or equal to 0-- element one, two-- minus x comma-- element two, two-- if x less than 0. Full stop-- math. VOLKER SORGE: So what you could hear, now, was that it not only spoke everything in a very verbose way-- all the vertical bars, as well as the opening brace-- but ChromeVox also fell into the trap that it saw something it thought might look like it's a matrix. Because it's exactly the same MathML structure, underneath, as the matrix on the previous page, which I have demonstrated. Now if we apply some slightly more intelligent way of reading this here, we actually get something slightly more sensible. CHROMEVOX: Equation sequence-- square root of x square equal absolute value of x equal-- case statement, case one-- x, if x greater than or equal to 0-- case two-- negative x, if x less than 0. Full stop-- math. VOLKER SORGE: So you could see here that what happened was that now, first of all, ChromeVox has a more semantic interpretation of the expression. It gives you a quick summary at the beginning-- it's a sequence of equations. So there's more than two parts of the equation. And then, it reads it in a more intelligent way, by realizing something is an absolute value, as well as actually finding the case statement correctly, rather than pronouncing it as a matrix, as it has done before. And a similar semantic interpretation can then be applied to all the other statements. 
So if you go back, for instance, to this statement here, and pronounce it, first, in the old way-- CHROMEVOX: f left parenthesis x right parenthesis equals square root of x-- math. VOLKER SORGE: All right, let's apply some semantics. CHROMEVOX: f of x equals square root of x-- math. VOLKER SORGE: Right. So as you could see here, instead of pronouncing all the parentheses, we now say "f of x." Because we realize this is, indeed, a function application, and we pronounce it accordingly. Now the way this is being done is by using an underlying speech rule engine, which allows for various ways of customizing the rule base that is being used in order to speak the mathematical expressions. And I'll explain, very briefly, how that works. Effectively, what happens is the math expression is, of course, a tree representation. In particular, it's a MathML tree representation. And this representation is recursively being traversed. And in every node, an applicable rule is being computed. And that rule then has some actions which fire, in the sense that they produce some speech output, as well as produce a way to further recursively traverse the rest of the tree. And these rules can be customized in various ways. One way of customizing them is customizing them with respect to mathematical domain. So what might be useful in one mathematical domain might not be the same in another mathematical domain. And when I say "mathematical domain," what I actually mean is something like algebra, geometry, analysis, calculus, things like this. So you might want to pronounce one expression, in algebra, differently than in geometry. In addition to this, we have the possibilities of customizing rules by their style. And when I say "style" here, I mean something like, do I want it to be spoken short. Do I want it to be spoken in a verbose mode, for instance. Do I want it to be spoken in some particular way that I like, and nobody else likes, et cetera.
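The traversal Volker describes (visit a node, find an applicable rule, let its action produce speech and recurse) can be sketched as below. The tree shape, the rule format, and the style names "verbose" and "semantic" are invented for this example; in particular, the tree here already encodes the function application, whereas in the talk that knowledge comes from swapping in a semantically enriched representation.

```javascript
// A tiny expression tree for f(x) = sqrt(x); node shapes are invented here.
const expr = {
  type: "relation",
  op: "equals",
  left: { type: "apply", fn: "f", arg: { type: "id", name: "x" } },
  right: { type: "sqrt", arg: { type: "id", name: "x" } },
};

// Each rule has a precondition (test) and an action that produces speech and
// decides how to recurse into the children via the speak callback.
const rules = {
  verbose: [
    {
      test: (n) => n.type === "apply",
      action: (n, speak) => `${n.fn} left parenthesis ${speak(n.arg)} right parenthesis`,
    },
  ],
  semantic: [
    {
      test: (n) => n.type === "apply",
      action: (n, speak) => `${n.fn} of ${speak(n.arg)}`,
    },
  ],
  shared: [
    {
      test: (n) => n.type === "relation",
      action: (n, speak) => `${speak(n.left)} ${n.op} ${speak(n.right)}`,
    },
    { test: (n) => n.type === "sqrt", action: (n, speak) => `square root of ${speak(n.arg)}` },
    { test: (n) => n.type === "id", action: (n) => n.name },
  ],
};

// Build a speak function for one style; style-specific rules are tried first.
function speakWith(style) {
  const active = [...rules[style], ...rules.shared];
  const speak = (node) => {
    const rule = active.find((r) => r.test(node));
    if (!rule) throw new Error(`No rule for node type ${node.type}`);
    return rule.action(node, speak);
  };
  return speak;
}

console.log(speakWith("verbose")(expr));
console.log(speakWith("semantic")(expr));
```

Because the action receives the speak callback, a rule that never calls it simply stops the recursion at that node.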
And finally, we also have the possibility of swapping the underlying representation while the speech engine is active. So that's what I've just shown you. The underlying MathML representation, here, was swapped out with a more semantically enriched representation. Now the problem with this semantic enrichment is, of course, that this is a fairly ad hoc procedure. MathML is a language which primarily aims for display. So there's two types of MathML. One is called Presentation MathML, and one is called Content MathML. The idea of Presentation MathML is that it takes care of how the math expression is being laid out on the page. And this is what we're currently working with. The idea of Content MathML is that an author of a math expression actually puts in some semantics. Unfortunately, that is hardly available on the web. Therefore, what we have to do is we have to take the Presentation MathML expression, and interpret it in a way that seems reasonable to us. Obviously, this can break down in many cases. And therefore, the person who actually knows best how to interpret a MathML expression is generally the author of the expression, themselves. As a consequence, we believe that it is very useful for people to actually be able to customize the speech rules in a way that they will pronounce the MathML expressions they write in the correct way. We therefore, in ChromeVox, offer an API to users out there, to authors of web pages, to developers of EPUB 3 readers that allows them to define their own math rules-- either override or add to our rule base-- in order to have math expressions being spoken in the way they prefer. And the rest of the time I will now spend introducing these particular rules, as well as demonstrate how they can be written, and what the different components of these rules are. Let's, for this purpose, all go back to our favorite example, the quadratic formula.
We'll start manipulating the division in the formula in order to demonstrate how we can change rules. Right. Here, I'll bring up the console quickly. Let me get back to the expression. CHROMEVOX: Given we have x equals minus b plus/minus square root of b square minus 4ac divided by 2a-- math. VOLKER SORGE: Right, so this is the default way, currently, ChromeVox speaks the expression. I will now write a rule here, in the console, that will alter the way the expression is being spoken. I've already written it previously, so I'll just bring it up here, and explain to you the components. Right. So what you can see here, in detail, is that we have-- these function names are the way one can call ControlVox-- sorry, not ControlVox, ChromeVox-- through the API. So here's the ChromeVox API. It's, in particular, the math API. And it offers a defineRule function. And the function here, at the moment, has four components. All of these are string components. The first component is a name. It's practically irrelevant what that name is. The whole idea of a name is that one can more easily find the rule later, somewhere in the speech rule engine, say, for debugging purposes. But for now, we can just ignore this. So we'll just call this "fraction rule." Then, comes some admin information which has to do with the domain, and the style, I've been referring to previously. So the domain was, just to recall this, the idea that we can specialize our rules with respect to different mathematical domains such as algebra, geometry, et cetera. In addition, we always have a default domain, which is the one we're working on right now, which means that if, for a particular domain, no rule is there, we can always fall back to the default rule. And currently, the way domains are being selected is interactively, by the user. Right. So this is what the default, here, says. Then, we have a little dot, which separates the domain name from the style name.
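The domain fallback described above could work roughly as in the following sketch: rules are keyed by a "domain.style" string, and a lookup for a domain without its own rule falls back to the default domain. The rule contents, the Map-based storage, and the function findRule are all invented for illustration; the talk does not show how ChromeVox stores or resolves its rules.

```javascript
// Speech strings keyed by "domain.style"; all entries are invented examples
// for one notation (letters A B with a bar drawn over them).
const overbarRules = new Map([
  ["default.short", "A B with bar over"],
  ["geometry.short", "line segment A B"],
]);

// Look for a rule in the requested domain first, then in the default domain.
function findRule(rules, domain, style) {
  const specific = `${domain}.${style}`;
  if (rules.has(specific)) return rules.get(specific);
  return rules.get(`default.${style}`); // undefined if no rule exists at all
}

console.log(findRule(overbarRules, "geometry", "short")); // geometry has its own rule
console.log(findRule(overbarRules, "algebra", "short"));  // falls back to default
```

Only the domain fallback is modeled here; whether styles fall back in a similar way is not something this sketch assumes.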
And styles-- again, we have styles like short, verbose, or I think we also have super brief. But there is no limit to what you want to call your style. If you want to call your style after your personal name, then we'll have an additional style in ChromeVox, in the future. And again, there is a default style. Here, we're working, now, on the default domain, and the style is short. All right. And then, the rest of the rule is effectively a way to test whether a rule is applicable, plus the action the rule is supposed to perform. In ChromeVox, this is given the other way around. We first start with the action and then with the precondition, or the part of the rule that tests whether the actual rule is applicable. I'll tell you, in a second, why this is the case. Let's first have a look at this latter part, which talks about the precondition if the rule is applicable. What we want is a node, which is our fraction up here. This division node is given by a MathML tag, which is called mfrac. And we select this using an XPath expression here. So this is a regular XPath expression, which just says, well, the current node-- the node we are on-- is a node with an mfrac tag in the MathML namespace. So this is just a regular XPath expression. Then, the rule will fire. And it will perform its action. So what are the actions? Well, actions are a sequence of components of different things the rule does. In this case, we only have one component. And that component is currently composed of two things-- the type of the component, as well as what is being performed. So the type of the component here is given in square brackets. In this case, the type is t, which means it's just a text that is being spoken. And the text that is being spoken is then given as a string here, in string syntax, with double quotes at the beginning and the end. So when we find the division node, all we want, at this point, is a string being spoken where it says, "some division." 
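The rule described up to this point can be sketched as one call with four string arguments: name, domain and style, action, and precondition. This is a reconstruction from the spoken description, not a copy of what was on screen. The function path `cvox.Api.Math.defineRule` and the exact spelling of the action and XPath strings are assumptions, and `recordedRules` is a hypothetical stand-in so that the sketch also runs where ChromeVox is not loaded.

```javascript
// Hypothetical fallback: when no ChromeVox math API is present, record the
// call instead of failing, so the argument order can still be inspected.
const recordedRules = [];
const mathApi =
  (typeof cvox !== 'undefined' && cvox.Api && cvox.Api.Math)
    ? cvox.Api.Math
    : { defineRule: (...args) => { recordedRules.push(args); } };

mathApi.defineRule(
  'fraction rule',        // name: only used to find the rule again later
  'default.short',        // domain, a dot, then the style
  '[t] "some division"',  // action: one component of type t (spoken text)
  'self::mathml:mfrac'    // precondition: the current node is an mfrac (assumed syntax)
);
```

The action comes before the precondition, which matches the order given in the talk.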
Let's put that rule in. And let's see what ChromeVox says now. CHROMEVOX: Given we have x equals some division-- math. VOLKER SORGE: Right. That's all it says-- some division. x equals some division. We haven't specified anything else. In particular, what we have not specified here is how it should deal with any of the other content that's still in the expression. So all we say at this point is, say one string and then stop. Do not recurse any further. I'll show you, in a second, how we can do that, otherwise. Let's first talk a bit more about the precondition. So the precondition here is an XPath expression, as I said, that selects this particular node-- this particular node with the mfrac tag. So it's this division node. Now in addition to this, one can then give additional constraints. And the number of additional constraints is unlimited. Therefore, this is the reason why we put the precondition at the end of this particular rule definition. Because you can add more, and more, and more, and more constraints to it. And I'll add one constraint, just for you to be able to see what this will look like. So when we say, OK, so we want this rule just to be applicable to something that says that it's the node, itself. And it has a descendant, which is an msqrt. All right? So that means it has a descendant, which is a square root, up there. And again, this is in the MathML namespace. Right, sorry. Unexpected token means I have forgotten to close my expression, my string expression, with a square root. Let's see whether it's applicable. Well, the thing is that we haven't really changed any of the actions. So let's change the actions, so we can actually hear whether the different rule is being applied. "--some division with square root." CHROMEVOX: Given we have x equals some division with square root-- math. VOLKER SORGE: Right. So this has, indeed, worked. And it now applies the other rule. So we have specialized the rule even further. 
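How a precondition and its additional constraints decide whether a rule fires can be illustrated with a toy model. The sketch below does not use XPath or the ChromeVox engine. It uses plain objects and hypothetical helpers (`hasDescendant`, `ruleApplies`) to mirror the two tests from the demo: the node itself is an mfrac, and it has an msqrt somewhere below it.

```javascript
// Toy tree standing in for the MathML of the fraction in the quadratic formula.
const fraction = {
  tag: 'mfrac',
  children: [
    { tag: 'mrow', children: [{ tag: 'msqrt', children: [] }] },
    { tag: 'mrow', children: [] }
  ]
};

// Mirrors a descendant test: true if any node below `node` has the given tag.
function hasDescendant(node, tag) {
  return node.children.some(
    child => child.tag === tag || hasDescendant(child, tag));
}

// A rule applies only when the precondition and every constraint hold.
function ruleApplies(node, precondition, constraints) {
  return precondition(node) && constraints.every(test => test(node));
}

const isFraction = node => node.tag === 'mfrac';
const containsRoot = node => hasDescendant(node, 'msqrt');

console.log(ruleApplies(fraction, isFraction, [containsRoot]));
```

Because the constraint list can have any length, it is natural for it to come last in the rule definition, as the talk explains.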
So in addition, we can now add more and more constraints. Anything XPath 1 allows you to express, you can express in these constraints. And those are binary constraints. So they only have to hold, or not. Right. So this is as far as I want to explain the preconditions. Now let's examine, a bit further, what we can do in terms of the actions of our rule. In particular, just having one string pronounced, if you actually have a complex expression, is usually not very fruitful. So we might want to have a bit more pronounced. Therefore, let's dive into the expression and pronounce some of the child nodes, actually. Right. So we now start a new component. Components are being separated by semicolons, so that's very similar to what you have in regular JavaScript. And we now want to specify a new component. And this is a particular component which works on single nodes. So therefore, it gets the type n. Again, the type is given in square brackets. And now we can write a selector which is just yet another XPath expression. So we write another XPath selector that selects whatever child we want to work on, or whatever node we want to work on. It doesn't necessarily have to be a child. You can effectively also work on any other node that is reachable from this particular node you're on. But in our case, we want to work on the child. But let's not work on both. Let's just work on the second one, for instance. So this XPath expression, now, just selects, from the given node, the second child node. Right. And what the action now does is it will pronounce the string "some division with square root." And afterwards, it will ask the speech rule engine to recurse on this node that we have selected here. Let's do that again. CHROMEVOX: Given we have x equals some division with square root 2a-- math. VOLKER SORGE: Right. So what you can hear is that we heard the expression up to the division. Then, it would pronounce the string. And then, it would pronounce the second child node. Right. 
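To make the shape of an action string concrete, here is a minimal parser sketch. It is not ChromeVox's own parser. `parseAction` is a hypothetical helper, the selector `./*[2]` for the second child is an assumed spelling, and the sketch assumes that no semicolon appears inside a quoted text.

```javascript
// Split an action into components at semicolons and read off each
// component's type (the letter in square brackets) and its content.
function parseAction(action) {
  return action.split(';').map(part => {
    const match = part.trim().match(/^\[([tnmp])\]\s*(.*)$/);
    if (!match) {
      throw new Error('Unrecognized component: ' + part);
    }
    return { type: match[1], content: match[2] };
  });
}

const components = parseAction(
  '[t] "some division with square root"; [n] ./*[2]');

// First component: text to speak. Second: a node to recurse on.
console.log(components);
```

Read this way, the rule speaks the text first and then hands the selected node back to the speech rule engine, which is why the demo ends with "2a".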
Similarly, we could now put in more content. So say we put in another string of type t. Say this is now a string called a numerator. And we now, for instance, want to recurse on the first node, as well-- on the first child as well, not the first node. Let's do that. Let's listen to what's happening. CHROMEVOX: Given we have x equals some division with square root 2a and numerator minus b plus/minus square root-- VOLKER SORGE: I'll stop it here. We all know what the numerator is. But you can see how this now works. Now one of the things that David pointed out earlier was that you can actually do changes to the way the text to speech engine speaks things from within a rule. And let's do that. The way this works is by adding, in particular, annotations to each component. So for instance, say we want to change the pitch on the first component that's being spoken. Or in our case here, it's the second component of the denominator. And let's say we just want to have a significant pitch change, say, to 1.5. Let's see how that sounds. CHROMEVOX: Given we have x equals some division with square root 2a and numerator minus b plus-- VOLKER SORGE: So you could hear that the pitch was changed quite drastically when the denominator of the division was being pronounced. So the way we do these annotations is by adding in round brackets, the particular property we want to change, as well as the value that changes it. All right. So in addition, we can, for instance, also increase the rate of the speech. Say we increase it to 2.5. And now let's see what's being spoken now. CHROMEVOX: Given we have x equals some division with square root 2a and numerator minus-- VOLKER SORGE: That was not really-- That was-- CHROMEVOX: Simple school algebra, Volker Sorge, the third April, 2013 VOLKER SORGE: Sorry, that was not much of a change. I'll just refresh the console so we can see a bit more. Let's try to change it to something different. Maybe 0.5 will make a difference. 
CHROMEVOX: Given we have x equals some division with square root 2a and numerator minus-- VOLKER SORGE: Now you could hear that it was actually quite a bit faster. So what else can we do with this? Well, another thing that we've heard earlier was that we can add pauses to our rule. Let's do that as well. Let's add a pause annotation, say, to the second-- uh, no, to the first of these-- children that we speak. And let's add quite a drastic pause. Pauses are given in milliseconds. So let's add one second, so we can actually hear it more clearly-- so 1,000. CHROMEVOX: Given we have x equals some division with square root 2a and numerator minus b-- VOLKER SORGE: Right. So the effect you could hear here was something I wanted to demonstrate. It is that the pause, as well as all the other annotations-- i.e. the pitch and the rate-- that we give a particular node here, or particular component, are recursively being applied by the speech rule engine. That means if we change the pitch somewhere, this pitch change will be propagated further down in the recursive traversal of the tree. That means if we have nested expressions, and we change the pitch several times, these changes will add up, which gives a very nice effect when it comes to pitch and rate changes. It doesn't really give a very nice effect when it comes to pause changes. Because what that means is that the pause is applied after every single element that occurs. So you might want that. But you don't necessarily want that. So in this particular case, we don't. So how do we actually do it in order to apply the pause between two chunks, rather than having it applied recursively, during traversing the tree? The way this works-- and I'll show it to you here-- is by adding the pause explicitly as what we call a personality annotation, which is another component of the actions. And this component has the type p. So we put that in here. Let's get the bracketing right. And let's see what happens now. 
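The difference between a pause given as an annotation on a node and a pause given as its own component can be seen in a toy traversal. This sketch only models the behaviour described in the talk. The tree shape and the `speak` helper are hypothetical and are not taken from the speech rule engine.

```javascript
// Toy tree for the denominator "2a": leaves carry text, inner nodes carry children.
const denominator = { children: [{ text: '2' }, { text: 'a' }] };

// An annotation given to a node is inherited by everything below it,
// so the pause ends up after every spoken leaf.
function speak(node, pause) {
  if (node.text !== undefined) {
    return [{ text: node.text, pauseAfter: pause }];
  }
  return node.children.flatMap(child => speak(child, pause));
}

// Variant 1: the pause is an annotation on the node component.
const annotated = speak(denominator, 1000);

// Variant 2: the pause is a separate component, applied once after the node.
const explicit = [...speak(denominator, 0), { text: '', pauseAfter: 1000 }];

const totalPause = list => list.reduce((sum, u) => sum + u.pauseAfter, 0);
console.log(totalPause(annotated), totalPause(explicit));
```

In the first variant every leaf is followed by a one-second pause. In the second, a single pause follows the whole denominator, which is the effect wanted in the demo.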
CHROMEVOX: Given we have x equals some division with square root 2a and numerator minus b-- VOLKER SORGE: Right. What you could hear now was that the pause was actually only applied straight after the denominator had been pronounced. Right. And in addition to this, we can also, for instance, change the volume. But I'm not going to demonstrate that now, in detail. Instead, I'm going to demonstrate one final thing we can do with these speech rules, and also, one final component-- one type of component-- for the action. And that is useful when you have a node where you do not necessarily know how many children the node has, and you want to recurse over all these nodes. And we'll do that-- given our division here. Right. And this component is of type m, which just stands for multinode. And you then give it a set of nodes. In our case, we select the set of nodes, again, with an XPath expression, which is just all subnodes-- or all child nodes-- of our mfrac expression. Let's do that. Let's see what happens. CHROMEVOX: Given we have x equals some division with square root minus b plus/minus square root of b square minus 4ac 2a-- math. VOLKER SORGE: Right. So what you could hear now was that all the elements are actually being pronounced properly this time, in order, because this is just the regular order within the MathML expression. Now obviously, for an expression like division, where you really know that you only have two children, this might not make a lot of sense. But if you do not know exactly how many children there are-- say, in the matrix expression we've demonstrated earlier-- you have to use this way of accessing the children. And then you have, in addition, the opportunity, or the possibility, to put in something that is being spoken in between all of these expressions. And the last thing I'll demonstrate to you now is giving what we call a separator string as a further annotation to our component. 
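What the multinode component does with and without a separator string can be sketched with a simple join. `speakAll` is a hypothetical helper, and the child texts are copied from what ChromeVox speaks in the demo. The engine's real handling of separators may differ.

```javascript
// Speech already produced for each child of the mfrac node, in document order.
const childSpeech = [
  'minus b plus/minus square root of b square minus 4ac',
  '2a'
];

// Speak every selected node in order. A separator, if given, is inserted
// between neighbouring nodes only, never before the first or after the last.
function speakAll(parts, separator) {
  return parts.join(separator ? ' ' + separator + ' ' : ' ');
}

console.log(speakAll(childSpeech, ''));
console.log(speakAll(childSpeech, 'divided by'));
```

The same helper works for any number of children, which is the case the multinode component is meant for, for example the rows of a matrix.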
And the separator is just a regular string, in this case. And let's just say, "divided by," and spell it right, and see what happens now. CHROMEVOX: Given we have x equals some division with square root minus b plus/minus square root of b square minus 4ac divided by 2a-- math. VOLKER SORGE: Right. Back, we are, at the original expression. Right. So this, so far, is the explanation of this API. And why did we go into that much detail in order to demonstrate to you how this API works? Well, the basic idea of ChromeVox is, since it is an open project, that we're hoping that people will exploit this API in the future in order to write explicit speech rules-- say, on their web pages-- to make sure that the math they write, or they put on the web, is pronounced in the right way. And for instance, I personally, as a university lecturer, university professor, could imagine that I'll put some of my lecture notes on the web. And as I know exactly how things have to be pronounced, I also put speech rules there so my students with visual impairments can actually go there, and listen to the math in the correct way, in exactly the way I would speak it in my lecture. Right. And with this, I'll end my part of the presentation and pass back to David, who's going to conclude with some more details on ChromeVox and the project. DAVID TSENG: All right, thank you so much, Volker. I wanted to conclude by thanking everyone for watching, and sticking with us through all of our demos. You can find a little bit more information about ChromeVox on chromevox.com. And we are an open source project. So you can actually even take a peek at all of our code, including all of the things that make math possible. We'd be happy to take any feedback that you have. You can also visit us, in general, for accessibility, at Google, at the address google.com/accessibility. Thanks again for watching. And we hope to receive any feedback that you have. Thanks.