Rich Interaction Document




The document format is a proposal for an approach to a text format that not just acknowledges but supports the fact that there is more behind the text than what the “black lines on the white background” show: the lines on the screen, the typography, are only the tip of the iceberg. This means that when you edit a document and cut and paste, you will not ‘break’ the hidden metatags/markup.


The document format will be open and developed collaboratively.


Core requirements include richly embedded data and robustness.


This type of document will lay the foundation for continual development of deeper literacies.



Project Collaborators


The development of this project is carried out from the Web And Internet Science (WAIS) Group at the University of Southampton. Potential collaborators are reading this page now and should email me to be added:





The goal is to produce a robust document format which supports rich interaction not only in one host application but between any applications now and in the future.


The Liquid | Author word processing application will function as a real-world test-bed for the format.



The Importance of Meta-Information


I believe that further ‘meta’-information, also referred to as ‘tags’, is integral to the text, not something ‘tagged’ onto it, as I will explain.


It is not enough to have the written word. We can also benefit from knowing who contributed to what section in a collaborative document, where text originated from, what different people feel about it and much more, and this can lead to really amazing innovation:



Open Innovation With Views Support


We need to develop an open (and free to use) document format with richness and robustness, where different approaches for dealing with the text and other data can be developed. The end user should be able to take the document from one application, such as a traditional linear word processor, to a completely different application, such as a 3D mind-map application with multi-user interaction, and back again for further work, without losing any relevant meta-information, such as who ‘wrote what’ and where sections of the document should be displayed in 3D space.


Such a new approach is important since we will require large amounts of additional information to accompany richly interactive documents as we develop ever more powerful ways to interact with the text.






The Problem With Current Approach – Expanded




What is usually stored or packed with the text is formatting which modifies the look of the font, such as making it bold, italic, a different size, color and so on. This is done in variations of what you know from HTML: if you want something in bold, you specify where the bold should start by putting the word Bold inside angle brackets, and where it should end by putting the same word inside angle brackets with a ‘/’ in front of it, like this: <Bold>Bold Text</Bold>. It’s pretty straightforward and, as you likely know, this way of marking up the text uses what are referred to as ‘tags’.


So here is the thing: let’s start by agreeing that adding the typeface to the text could be a useful thing and, indeed, this is possible in HTML. Does that mean anything other than symbolically? No. But it can open up a new world:


Marking Meaning, Not Just Fiddling With Formatting


The new world comes with a new respect for meaning-adding tags: the tags which explain what the text is all about, tags which provide context. This could be as simple as tagging something as being ‘important’ or tagging someone’s name in Facebook.


However, just adding more and more tags to the text poses problems as well as opening up new possibilities:


The Problem with Markup


The problem with adding more and more tags is that, since there is more and more happening below the ‘water level’, things can go wrong when you interact with the text. When copying and pasting, for example, you may end up not copying the full markup, or you may end up inheriting markup when you paste. It can get messy quickly. Anyone who has ever copied and pasted and had their text change formatting knows this well. The problem is so widespread that you can actually choose to paste without formatting, inheriting whatever formatting is already on the line you are pasting into.
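
To make the failure concrete, here is a minimal sketch of what happens when a selection is made by character position rather than by tag. It reuses the <Bold> example from earlier; the function name `unbalanced_tags` and the simple tag pattern are assumptions made for this illustration and are not part of any proposed format.

```python
import re

# Matches simple tags such as <Bold> and </Bold> (no attributes).
TAG = re.compile(r"<(/?)(\w+)>")


def unbalanced_tags(fragment: str) -> list[str]:
    """Return the tags left unmatched in a fragment, e.g. ['Bold'] or ['/Bold']."""
    open_tags: list[str] = []  # opened but not yet closed
    stray: list[str] = []      # closing tags with no matching opener
    for match in TAG.finditer(fragment):
        is_closing, name = match.group(1) == "/", match.group(2)
        if not is_closing:
            open_tags.append(name)
        elif open_tags and open_tags[-1] == name:
            open_tags.pop()
        else:
            stray.append("/" + name)
    return open_tags + stray


marked_up = "Plain <Bold>Bold Text</Bold> plain"
copied = marked_up[:16]       # a character-based selection ending mid-capsule
left_behind = marked_up[16:]  # what remains in the source document

print(repr(copied), unbalanced_tags(copied))
print(repr(left_behind), unbalanced_tags(left_behind))
```

Both halves end up with a dangling tag: the copied fragment opens a Bold it never closes, and the remainder closes a Bold it never opened.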




A Proposal For An Approach


As will be described further below, the main notion is to develop a system where once text is tagged, the text cannot be edited arbitrarily: the user must respect the spans of the tags to cut, copy and otherwise move or edit the text. Think of tagged text as being like the tags you see in Facebook or in other services, where you type in a name and the name is then no longer editable, since editing it would make the tag invalid.


How this interaction would occur would differ from application to application, but the basic principles would be observed everywhere: for example, it would not be possible to edit a quotation without changing the citation behind it.


A Proposal For A Solution


A solution can be changing the way we think of text by moving away from the approach that text is just speech made visible through lines on a substrate (after all, you can say a letter without saying what font it is, but then the tone of your voice etc. comes into play; let’s not go there, it’s not a very useful philosophical avenue). The solution comes when we start accepting the powerful potential of thinking about the visual marks as simply the tip of the iceberg, where there is a lot going on underneath: the history of the text, contextual information and more, stored frozen, encapsulating the text so that it sticks with the text.


This means selection boundaries become dependent on where the tags span and are no longer arbitrary, falling in between any characters.
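
As a rough sketch of this idea, a selection made by character offset can be widened until it no longer cuts through any tagged capsule. The `TagSpan` type, the `snap_selection` function and the choice to widen (rather than shrink or refuse) are all assumptions made for this illustration; an application could choose a different policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TagSpan:
    """A hypothetical tagged capsule covering text[start:end]."""
    name: str
    start: int  # inclusive character offset
    end: int    # exclusive character offset


def snap_selection(sel_start: int, sel_end: int,
                   spans: list[TagSpan]) -> tuple[int, int]:
    """Widen a selection until it contains every span it overlaps in full."""
    changed = True
    while changed:  # repeat, since widening may reach into another span
        changed = False
        for span in spans:
            overlaps = sel_start < span.end and span.start < sel_end
            cut_through = span.start < sel_start or span.end > sel_end
            if overlaps and cut_through:
                sel_start = min(sel_start, span.start)
                sel_end = max(sel_end, span.end)
                changed = True
    return sel_start, sel_end


text = "Ada said: to be or not to be, and left."
spans = [TagSpan("citation", 10, 28)]  # covers "to be or not to be"

start, end = snap_selection(13, 20, spans)  # user dragged across "be or n"
print(text[start:end])
```

A selection that does not touch any capsule is returned unchanged, so ordinary untagged text still behaves the way it does today.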


If we don’t take this approach, of letting selections be determined by tags and not by characters, then making such a system robust would be difficult: an extra empty space, or including or not including a punctuation mark (which is outside or inside the tagged capsule), could spell formatting mess. The cost is that editing such text would be slightly more complicated than it currently is, since the user would be constrained in how the text could be cut, moved and so forth.


This would mean that trying to edit text which was copied across as a citation would give the user a choice: break the citation or write elsewhere.


On the Rehabilitation of Modes


It’s long been fashionable at Apple (and in other computer companies I would guess) to have mode-less work environments but the reality is that sometimes you can benefit from having different modes to access and create information, to the point where usually the modes are actually there, but hidden by having different applications for each mode, such as a web browser and a web editor, a video viewer and a video editing application and so on.


To be clear, what I am proposing is that when a user edits Liquid text, selections are guided and snap-to the capsule wrapping determined by the span of the tags rather than the marks of the text and that it’s easy for the writer/editor to open up the text to see all the constituent tags, which can then be edited and/or deleted.


A reader would have similar control, to see and manipulate the tags, but not to edit the tags nor, of course, the document.




Treating text as deep information objects encapsulated by tag spans will let us keep connections for cited text more deeply and clearly, through what I would like to call the liquid environment between the .liquid documents.


Many automatic tags can then be added, such as where, when and by whom text is written, and in response to what, without fear of it becoming messy, and this can help the reader organize and analyze the text later.
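
One possible shape for such automatic tags is sketched below. The `Capsule` record, its field names and the `capture` helper are hypothetical and chosen only for this example; the format itself has not been decided.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Capsule:
    """Hypothetical record: a piece of text plus automatically added tags."""
    text: str
    author: str
    written_at: datetime
    location: Optional[str] = None
    in_response_to: Optional[str] = None


def capture(text: str, author: str, *, location: Optional[str] = None,
            in_response_to: Optional[str] = None,
            now: Optional[datetime] = None) -> Capsule:
    """Wrap freshly typed text with who, when, where and in-response-to tags."""
    return Capsule(text, author, now or datetime.now(timezone.utc),
                   location, in_response_to)


def by_author(capsules: list[Capsule], author: str) -> list[Capsule]:
    """Let a reader pull out everything one contributor wrote."""
    return [c for c in capsules if c.author == author]


# Placeholder authors and text, for illustration only.
doc = [
    capture("First draft of the opening.", "alice"),
    capture("A reply to the opening.", "bob",
            in_response_to="First draft of the opening."),
]
print([c.text for c in by_author(doc, "bob")])
```

Because the tags travel with the text rather than sitting beside it, a reader application could later filter or group the document by any of these fields.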


Sentences could be actively made context aware, letting the author specify dynamic, automatic changes based on external (or internal) circumstances, such as the text updating its grammar from future tense to past tense when a specific date passes: “I will do this in 2015” becomes “I planned to do this in 2015”.
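
A minimal sketch of such a context-aware sentence follows, assuming the author supplies both wordings and the date on which to switch. The `TimedSentence` name and the choice of 1 January 2016 as the switch date are assumptions made for this illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TimedSentence:
    """Hypothetical capsule holding two wordings and a switch-over date."""
    before: str
    after: str
    switch_on: date  # first day on which the 'after' wording is shown

    def render(self, today: date) -> str:
        return self.before if today < self.switch_on else self.after


sentence = TimedSentence(
    before="I will do this in 2015",
    after="I planned to do this in 2015",
    switch_on=date(2016, 1, 1),  # assumed: switch once 2015 is over
)

print(sentence.render(date(2014, 6, 1)))
print(sentence.render(date(2016, 3, 1)))
```

The reader application decides which wording to show at display time, so the stored document never needs to be rewritten.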


Text edited by different people could be clearly marked as such.


Since this approach would let data be imported on typing, such as the Info Box in Wikipedia, the document would have tagged the birth dates and such for all the people in the document and the user could then choose to view the document chronologically if needed.
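
The chronological view could be as simple as sorting passages by a date tag, as in the sketch below. The `Passage` type, its fields and the sample people and dates are placeholders invented for this example, not data from any real document.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Passage:
    """Hypothetical passage tagged with a person and an imported birth date."""
    text: str
    person: str
    born: Optional[date] = None  # None when no date was imported


def chronological(passages: list[Passage]) -> list[Passage]:
    """Order passages by birth date; undated ones keep their order at the end."""
    return sorted(passages, key=lambda p: (p.born is None, p.born or date.max))


# Placeholder people and dates, for illustration only.
document = [
    Passage("Person C appears first in the text.", "Person C", date(1950, 5, 1)),
    Passage("Person A is mentioned next.", "Person A", date(1900, 1, 1)),
    Passage("An untagged aside.", "Unknown"),
    Passage("Person B closes the section.", "Person B", date(1925, 3, 15)),
]

for passage in chronological(document):
    print(passage.person)
```

The linear order of the document is untouched; the chronological order is just another view computed from the tags.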


When pasting text from another source, the text would not only be pasted with the appropriate citation history but would also look like a citation, complete with italics or quotes, whichever the user’s reader app shows.


Code and maths should also be embeddable so that basic experiments can be done in-document, through a similar logic to HTML iFrames.


Application Control


The user should be able to change any of the visual aspects, such as what a heading looks like or how a citation should be shown. The applications should not force the interactions, simply provide opportunities, though the default interactions need to be carefully considered and tested.




The standard will need to be robust so that data does not get mangled, within an application or when data is moved from one application to another.


While the main concern will be text, multimedia objects will also need to be taken into account.





We need to discuss the technical implementation. Should we base it on HTML/XML/Other?




The important thing about implementing .liquid is that the interactions, and the visual style in which it will work, are entirely up to the specific application.





If we stay within our current paradigm of WYSIAYG, or What You See Is All You Get, we will be operating in the shallowest of evolutionary environments.


If we embrace text as something much richer, much deeper, if we embrace text as inherent connections, if we allow text to be a portal to much more, then the opportunities to evolve ever more powerful text systems open up tremendously.


We lost some colour and charm when we moved from ink on paper. Let’s take the massive leap of going beyond text’s visual first impression.