Visual-Meta

 

 

 

Visual-Meta is a method of including meta-information about a document and its contents visibly in the document itself, as a human- and machine-readable appendix. It contains citation, addressing and formatting information, functioning, in a way, as reverse-CSS.

 

 

Enables Scholarly Copy

 

Visual-Meta enables Scholarly Copy by providing a transparently easy way to add full metadata to documents (initially PDF). Proof-of-concept implementations are Author and Reader as shown here in a brief demo: https://youtu.be/rjeEPnPzD6c

 

 

Enables Rich Views

 

Visual-Meta adds rich metadata about the formatting of included elements. This includes headings, for instant folding of the document and for searches that include heading elements in the results, and descriptions of how to parse tables, images and special interactions such as graphs, for dynamic re-creation by reader software.

 

 

Not Yet-Another-Standard-Proposal

 

This is not a new format; it is a novel use and extension of the academic-standard BibTeX format, with JSON additions.

 

 

Server Surfacing

 

Visual-Meta can provide servers with information about what is in the document in a semantically meaningful way, for better extraction of textual and multimedia components.

 

 

Robust

 

As listed below, in the Key Benefits section, Visual-Meta provides robustness and increased usability of meta information.

 

 

ACM Hypertext 2019 Visual-Meta Presentation

 

 

 

Key Benefits: Why Visual vs. Embedded

 

  • Advanced Meta embedded in the document header or package is not directly accessible by the end user
  • Easy to Add & Extract. A common complaint about embedded meta is that there is no standard beyond the basics (which are not often employed), so it is near-impossible to use at scale. Being based on BibTeX means that a simple copy and paste will add significant, useful meta
  • Self-explaining standard which requires no technical expertise to add
  • End-User immediate benefit for adding Visual-Meta. End-users who add Visual-Meta to their own or legacy PDFs have the immediate benefit of Scholarly Copy and not being locked into a Reference Manager, making Visual-Meta more adoptable than trying to establish a new header-meta standard.
  • Robust:
    • Can survive document format change
    • Can survive being printed out, scanned and OCRed, with nothing lost
    • All supported meta can survive document format and operating system updates without becoming unreadable
  • Trivially easy for a human reader to verify
  • Trivially easy to append to legacy documents and to strip if not desired anymore
  • Can handle large amounts of formatting information for reader software to use to reformat and re-present the document as well as provide rich interactions

 

 

Exceptionally Easy To Implement in Current Systems

 

The Visual-Meta is wrapped in the BibTeX format (so much so that even a simple copy of BibTeX from a download site will comply). It has four logical sections: Citation information, Addressing information, Formatting information and Provenance information. Usually the page with Visual-Meta has the text 'Visual-Meta' as a heading in the same top-level heading style as the main document, but this is for elegance only; it is not technically necessary. The actual Visual-Meta starts with @{visual-meta-start} and ends with @{visual-meta-end}. The closing tag is crucial, since the Visual-Meta is appended to the end of the document and the document will therefore be parsed back to front for efficiency.
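
As a rough illustration of the back-to-front lookup described above, the following Python sketch searches a document's text from the end for the closing tag and then backwards for the opening tag. The function name `extract_visual_meta` is hypothetical, and the sketch assumes the document text has already been extracted into a plain string; it is not the parsing code used by Author or Reader.

```python
from typing import Optional

START_TAG = "@{visual-meta-start}"
END_TAG = "@{visual-meta-end}"


def extract_visual_meta(text: str) -> Optional[str]:
    """Return the text between the Visual-Meta tags, or None if absent.

    Hypothetical helper: it searches from the end of the document because
    Visual-Meta is appended after the main content.
    """
    end = text.rfind(END_TAG)
    if end == -1:
        return None  # no closing tag, so treat the document as having no Visual-Meta
    start = text.rfind(START_TAG, 0, end)
    if start == -1:
        return None  # closing tag found without a matching opening tag
    return text[start + len(START_TAG):end].strip()


# Illustrative document text (assumed, not taken from a real PDF).
document = (
    "Body of the paper...\n"
    "@{visual-meta-start}\n"
    "@article{example, title = {An Example}, }\n"
    "@{visual-meta-end}\n"
)
print(extract_visual_meta(document))
```

Checking for the closing tag first means a document without Visual-Meta can be rejected after a single search from the end.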

 

 

 

Visual-Meta Fields, with Examples

 

  • Citation Information

 

This is what the Visual-Meta for the Visual-Meta ACM article looks like. Please note that font and size do not matter:

 

author = {Hegland, Frode},

title = {Visual-Meta: An Approach to Surfacing Metadata},

booktitle = {Proceedings of the 2Nd International Workshop on Human Factors in Hypertext},

series = {HUMAN '19},

year = {2019},

isbn = {978-1-4503-6899-5},

location = {Hof, Germany},

pages = {31--33},

numpages = {3},

url = {http://doi.acm.org/10.1145/3345509.3349281},

doi = {10.1145/3345509.3349281},

acmid = {3349281},

publisher = {ACM},

address = {New York, NY, USA},

keywords = {archiving, bibtex citing, citations, engelbart, future, glossary, hypertext, meta,

metadata, ohs, pdf, rfc, text}, }

@{visual-meta-end}

 

  • Addressing Information

 

Addressing information (shown as currently implemented in the Citation section above) will be the usual citation information, with scope to be augmented with high-resolution linking to web pages (blogs in particular) and in-PDF sections, as well as robust multi-addressing. This is ongoing work.

 

  • Formatting Information

 

The formatting specification is implemented as custom fields, which can include anything the authoring software can describe, for extraction into interactive systems:

 

General Formatting:

 

formatting = { heading level 1 = {Helvetica, 22pt, bold}, heading level 2 = {Helvetica, 18, bold}, body = {Times, 12pt}, image captions = {Times, 14, italic, align centre} },
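
The nested `key = {value}` layout of the formatting field can be read with a small recursive parser. The Python sketch below is one possible approach, not the project's own parser: the function name `parse_fields` is hypothetical, and it assumes every value is wrapped in braces (bare values such as `month = jul` are not handled), that leaf values contain no braces of their own, and that keys are unique within one level.

```python
def parse_fields(text: str) -> dict:
    """Parse 'key = {value}, key = {value}' pairs into a dict.

    A value that itself contains 'key = {...}' pairs becomes a nested dict;
    any other value is kept as a stripped string.
    """
    fields = {}
    pos = 0
    while True:
        eq = text.find("=", pos)
        if eq == -1:
            break
        key = text[pos:eq].strip(" ,\n\t")
        open_brace = text.find("{", eq)
        if open_brace == -1:
            break
        # Walk forward to the brace that closes this value.
        depth = 0
        for i in range(open_brace, len(text)):
            if text[i] == "{":
                depth += 1
            elif text[i] == "}":
                depth -= 1
                if depth == 0:
                    break
        else:
            raise ValueError("unbalanced braces in field: " + key)
        inner = text[open_brace + 1:i]
        nested = "=" in inner and "{" in inner
        fields[key] = parse_fields(inner) if nested else inner.strip()
        pos = i + 1
    return fields


sample = (
    "formatting = { heading level 1 = {Helvetica, 22pt, bold}, "
    "heading level 2 = {Helvetica, 18, bold}, body = {Times, 12pt} },"
)
parsed = parse_fields(sample)
print(parsed["formatting"]["body"])
```

A reader application could use the resulting dictionary to map each named style to its font description. The repeated `term` keys in a glossary field would need a list-based variant instead of a dict.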

 

Citation Formatting, to allow a reader application to display citations in any style:

 

citations = { inline = {superscript number}, section name = {References}, section format = {author last name, author first name, title, date, place, publisher} },

 

Glossary, to allow a reader application to see and use any glossary terms:

 

glossary = { term = {Name of glossary term}, definition = {freeform definition text}, relates to = {relationship – “other term”},  term = {Name of glossary term number two}, definition = {freeform definition text}, relates to = {relationship – “other term”}, },

 

Special, to allow the authoring application to add anything, which a human programmer or advanced ML can read and optionally use:

 

special = { name = {DynamicView}, node = {nodename, location, connections} }

 

 

  • Provenance

 

The ‘version’ field is the version of Visual-Meta, the ‘generator’ is what added the Visual-Meta to the document, and the ‘source’ is where the data comes from, which is particularly useful when Visual-Meta is appended to a legacy document:

 

visible-meta = { version = {1.1}, generator = {Liquid | Author 4.6}, source = {Scholarcy, 2019,08,01} }

 

Sample Visual-Meta

 

Co-designed with Jakob Voß at github.com/nichtich/visual-meta

 

@{visual-meta-start}

 

@visual-meta{

version = {1.0},

generator = {Reader (Release) 1.3 (1)}, }

 

@article{Engelbart1962,

author = {Douglas Carl Engelbart},

title = {AUGMENTING HUMAN INTELLECT – A Conceptual Framework},

month = jul, year = {1962}, institution = {SRI},

document = {augmentinghu_douglas_engelbart_19621021231532_6396.pdf},

formatting = { heading level 1 = {Helvetica, 22pt, bold}, heading level 2 = {Helvetica, 18, bold}, body = {Times, 12pt}, image captions = {Times, 14, italic, align centre} },

citations = { inline = {superscript number}, section name = {References}, section format = {author last name, author first name, title, date, place, publisher} },

glossary = { term = {Name of glossary term}, definition = {freeform definition text}, relates to = {relationship – “other term”},  term = {Name of glossary term number two}, definition = {freeform definition text}, relates to = {relationship – “other term”}, },

special = { name = {DynamicView}, node = {nodename, location, connections} },

}

 

@{visual-meta-end}

 

Please note, the '@{visual-meta-end}' is crucial to have as the last element in the document since it is recommended to parse the document in reverse and have the software look for this element to confirm that visual-meta is present.

 

Extensions

 

Extensions can include time information, location information and person identifying information, where the key is to specify which method/system is being used for the data. These can be used as payload to a citation or as references in their own right. The user would assign these values in a similar interaction to how they would assign a URL to text.

 

The first part shows what text in the document is referred to, followed by the data, and finally a description of what type of data it is:

 

"In text Reference"  "Data"  "Type of Data/location of description"

<json>

[ {"name":"8:23am, Tuesday, 13th of May 2020", "2,208,988,800":"typeNTP"},

{"name":"14th & Madison, NY", "Latitude: 40 degrees, 42 minutes, 51 seconds N":"latlong"},

{"name":"David Millard", "0000-0002-7512-2710, person":"https://orcid.org"}

] </json>
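
To make the three-part layout concrete, here is a minimal Python sketch that pulls the payload out of a `<json>` block and splits each record into its in-text reference, its data and its type. The helper names `extract_json_blocks` and `split_record` are hypothetical, and the sketch assumes each record holds a `name` key plus exactly one further data/type pair, with the payload being strictly valid JSON.

```python
import json
import re

# Matches everything between <json> and </json>, across line breaks.
JSON_BLOCK = re.compile(r"<json>(.*?)</json>", re.DOTALL)


def extract_json_blocks(text: str) -> list:
    """Return the parsed content of every <json>...</json> block in text."""
    return [json.loads(payload) for payload in JSON_BLOCK.findall(text)]


def split_record(record: dict) -> tuple:
    """Split one record into (in-text reference, data, type of data)."""
    reference = record["name"]
    data, data_type = next(
        (key, value) for key, value in record.items() if key != "name"
    )
    return reference, data, data_type


sample = """
Some document text.
<json>
[ {"name":"8:23am, Tuesday, 13th of May 2020", "2,208,988,800":"typeNTP"},
  {"name":"14th & Madison, NY", "Latitude: 40 degrees, 42 minutes, 51 seconds N":"latlong"}
]
</json>
"""

records = extract_json_blocks(sample)[0]
for record in records:
    print(split_record(record))
```

Because the type travels with the data, the receiving software can decide how to interpret each value without consulting the original document.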

 

 

Please keep in mind that the goal is to be able to copy and paste data across systems while specifying how it is defined and formatted, as shown in brackets above. This is about self-declaring data (to a human or translation code) visible in plain sight; it is about allowing users to copy and paste self-defining JSON data.

 

These can be data pods, they do not necessarily need to be part of a document. As for the specifics, that is purely a matter of implementation.

 

Picture the scenarios: You read an ordinary PDF and come across the time an event happened. You can click on that time and a menu of options is presented (depending on what reading software you are using), including one showing exactly how long ago it was (or how near in the future it is). You can also copy the time and use it when you come across another time event, where you can now automatically see how far apart the two are in time.

Picture the same with geographical information; you can copy and paste locations and use them semantically with other locations.

Imagine coming across the names of people and having a solid link to their online presence and not having to guess who is really who.

 

And much, much more: this addressability creates the opportunity for rich, useful interactions.
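
The time scenario can be sketched in a few lines of Python, assuming that a `typeNTP` value holds seconds counted from the NTP epoch of 1 January 1900 (UTC). That reading of `typeNTP` is an assumption, and the function names and the second timestamp below are hypothetical. Note that the sample value 2,208,988,800, read this way, falls on 1 January 1970, so it is used here purely as a placeholder.

```python
from datetime import datetime, timedelta, timezone

# NTP timestamps count seconds from 1 January 1900 (UTC).
NTP_EPOCH = datetime(1900, 1, 1, tzinfo=timezone.utc)


def ntp_to_datetime(ntp_seconds: str) -> datetime:
    """Convert an NTP seconds string such as "2,208,988,800" to a datetime."""
    seconds = int(ntp_seconds.replace(",", ""))
    return NTP_EPOCH + timedelta(seconds=seconds)


def time_apart(first: str, second: str) -> timedelta:
    """Return the absolute time between two NTP timestamps."""
    return abs(ntp_to_datetime(second) - ntp_to_datetime(first))


copied = "2,208,988,800"       # value copied from one document
encountered = "2,209,075,200"  # hypothetical value met in another document

print(ntp_to_datetime(copied).isoformat())
print(time_apart(copied, encountered))
```

The same pattern would apply to locations or people: once the type of the data is declared, two copied values can be compared directly.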

 

Further Information

 

Further description is on the blog: wordpress.liquid.info/visual-meta and further information at: Visible-Meta Example & Structure. Full source code for parsing visual-meta will be made available here. Addressing is discussed at wordpress.liquid.info/10/scholarly-copy-addressing-clipboard/frode/

 

 

JSON Addition

 

JSON can be used to augment the way headings are recorded for a more robust result, as used in The Future of Text book:

 

<json>

[ {"name":"Acknowledgements", "level":"level1"},

{"name":"Contents", "level":"level1"},

{"name":"the future of text : DEAR READER", "level":"level1"}, {"name":"Dear Reader of The Distant Future", "level":"level2"}, {"name":"Dear Reader of Today", "level":"level2"},

{"name":"the future of text : introduction", "level":"level1"}, {"name":"HOW TO READ THIS BOOK IN READER", "level":"level2"}, {"name":"on symbols", "level":"level2"},

{"name":"Welcome", "level":"level2"},

{"name":"linearising", "level":"level2"},

{"name":"Arming the Citizenry: Deeper Literacies", "level":"level2"}, {"name":"Myriad Texts", "level":"level2"},

{"name":"Addressing", "level":"level2"},

{"name":"Point", "level":"level3"},

{"name":"Cite", "level":"level3"},

{"name":"Link", "level":"level3"},

{"name":"Analogue / Real World Addressing", "level":"level2"}, {"name":"Digital Addressing", "level":"level2"},

{"name":"Evolution", "level":"level2"},

{"name":"the future of text : Articles", "level":"level1"}, {"name":"Adam Cheyer", "level":"level2"},

{"name":"Adam Kampff", "level":"level2"}

] </json>
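
A flat heading list like this is enough to rebuild the document outline, which is what folding needs. The Python sketch below is one way a reader might do it; the function name `build_outline` and the `depth`/`children` keys are hypothetical, and it assumes every `level` value has the form `levelN`.

```python
import json


def build_outline(headings: list) -> list:
    """Nest a flat list of {"name", "level"} headings into a tree."""
    root = {"name": None, "depth": 0, "children": []}
    stack = [root]  # path from the root to the most recent heading
    for heading in headings:
        depth = int(heading["level"].replace("level", ""))
        node = {"name": heading["name"], "depth": depth, "children": []}
        # Climb back up until the top of the stack can be this node's parent.
        while stack[-1]["depth"] >= depth:
            stack.pop()
        stack[-1]["children"].append(node)
        stack.append(node)
    return root["children"]


payload = """[
  {"name":"the future of text : introduction", "level":"level1"},
  {"name":"Addressing", "level":"level2"},
  {"name":"Point", "level":"level3"},
  {"name":"Cite", "level":"level3"},
  {"name":"Evolution", "level":"level2"}
]"""

outline = build_outline(json.loads(payload))
introduction = outline[0]
print([child["name"] for child in introduction["children"]])
```

Folding the document to a given level then amounts to showing only the nodes whose depth does not exceed that level.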

 

 

JSON can also potentially be used to encode the entire document, enabling advanced functions such as complete reformatting of the document to suit the reader. Since the visual-meta can be very, very small, this does not have to increase the document's page count significantly.

 

 

‘This is a very important concept’
Vint Cerf