Definitionary and the Language Independent Document

"Definitionary" is a term I invented for an inverse dictionary: the entries are based on the definitions, not the terms. The Definitionary is also multilingual in that all languages share a global set of definitions, and equivalent terms link to the same definition. Normal documents are composed of words. Because of the cross-language aspect of the Definitionary, it is possible to create a simple cross-language electronic document composed of definitions instead (including concepts unique to one language). This definition-based Language Independent Document (LID) will have simple, almost poetic sentences. Technically, each definition is assigned a unique numeric key, and software is necessary to encode and decode the strings of definition keys. Language grammar goes out the window, replaced by a small set of standard syntaxes. Encoding software allows the user to choose one or more definitions for each word entered. When decoded, each definition is rendered as its most popular word. The key is that decoded documents are rendered in the reader's language of choice, regardless of what language was used when composing the document.
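
To make the data model concrete, here is a minimal sketch in Python of a store where every language shares one set of numerically keyed definitions and decoding picks the most popular term. It is not the prototype's code. The class and function names, the key 101, the sample terms, and the popularity scores are all assumptions invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Definition:
    key: int     # unique numeric key shared by all languages
    gloss: str   # the definition text itself
    # language code -> {term: popularity score}; scores here are made up
    terms: dict = field(default_factory=dict)


class Definitionary:
    """Hypothetical store: entries are definitions, terms hang off them."""

    def __init__(self):
        self.definitions = {}

    def add_definition(self, key, gloss):
        self.definitions[key] = Definition(key, gloss)

    def add_term(self, key, lang, term, popularity):
        self.definitions[key].terms.setdefault(lang, {})[term] = popularity

    def candidates(self, lang, term):
        """Keys of every definition that `term` can mean in `lang`."""
        return sorted(k for k, d in self.definitions.items()
                      if term in d.terms.get(lang, {}))

    def most_popular(self, key, lang):
        """The highest-scoring term for a definition in one language."""
        scores = self.definitions[key].terms[lang]
        return max(scores, key=scores.get)


def decode(definitionary, keys, lang):
    """Render a string of definition keys as one term per key."""
    return [definitionary.most_popular(k, lang) for k in keys]


d = Definitionary()
d.add_definition(101, "a building that people live in")
d.add_term(101, "en", "house", 90)
d.add_term(101, "en", "dwelling", 10)
d.add_term(101, "es", "casa", 95)
d.add_term(101, "es", "vivienda", 20)

print(d.candidates("en", "dwelling"))  # [101]
print(decode(d, [101], "es"))          # ['casa']
```

A writer who types "dwelling" and a reader who sees "casa" both pass through the same key, which is the point of keying on definitions rather than words.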

The LID has great value in the business world. For large, modern countries, the LID allows day-to-day business transactions between Europe, Asia, and the Americas without resorting to English. For businesses in small countries struggling to join the global economy, where English is not common or the business is too small to employ translators, the LID could be the key enabling technology. The Definitionary is very much an inclusive technology.

A simple example is "Tom likes bugs". It is encoded with the numeric definitions for "Tom (proper name)", "enjoys and appreciates", and "Volkswagen Beetles". This renders into Spanish as "Tom le gustan los vochos." The key in this example is that the English writer selected "VW Beetle" as the meaning of "bug", and not the more common "crawling invertebrate". As a side note, Google Translate returns "Tom le gusta un error."
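
The disambiguation step in this example can be sketched as follows. This is an illustration under stated assumptions, not the prototype: the numeric keys, the `choose` callback, and the Spanish base forms are hypothetical. The per-language rules that would turn the decoded terms into a finished Spanish sentence are deliberately left out, so the output is a list of terms.

```python
# Hypothetical keys; a real Definitionary would assign its own.
GLOSSES = {
    1001: "Tom (proper name)",
    2001: "enjoys and appreciates",
    3001: "Volkswagen Beetle",
    3002: "crawling invertebrate",
}

# One illustrative term per definition and language.
TERMS = {
    "en": {1001: "Tom", 2001: "likes", 3001: "bugs", 3002: "bugs"},
    "es": {1001: "Tom", 2001: "gustar", 3001: "vochos", 3002: "insectos"},
}


def candidates(lang, word):
    """All definition keys that `word` can stand for in `lang`."""
    return sorted(k for k, t in TERMS[lang].items() if t == word)


def encode(lang, words, choose):
    """Map words to keys, asking `choose` whenever a word is ambiguous."""
    keys = []
    for word in words:
        options = candidates(lang, word)
        if not options:
            raise KeyError(f"no definition for {word!r}")
        keys.append(options[0] if len(options) == 1 else choose(word, options))
    return keys


def decode(lang, keys):
    """Render each key as its term in the reader's language."""
    return [TERMS[lang][k] for k in keys]


# The writer picks the Volkswagen sense of "bugs" (key 3001).
keys = encode("en", ["Tom", "likes", "bugs"], lambda word, options: 3001)
print(keys)                # [1001, 2001, 3001]
print(decode("es", keys))  # ['Tom', 'gustar', 'vochos']
```

Because the choice is made once by the writer, the decoder never has to guess between the two senses of "bugs".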

Web pages created in the LID could easily be rendered into the local language by the browser (or a plugin, or even a server-based tool). This enables all LID web pages to be universally accessible to a global audience. The people and cultures that have been somewhat marginalized by the (largely) English language World Wide Web will have a global audience for their pages.

Since the Definitionary uses concepts, unique cultural aspects are preserved. People won't be forced to express their ideas in English. When there isn't a specific word for a concept in the reader's language, a phrase is substituted. Blended meanings and double entendres are supported via a mechanism allowing multiple definitions for a word-position in the LID. Thus the LID supports subtle meaning, and it also supports humor.
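
Both mechanisms in this paragraph, the phrase fallback and multiple definitions per word-position, can be sketched in a few lines of Python. Everything here is assumed for illustration: the keys, the sample words, the choice to join alternative meanings with a slash, and the use of Spanish "sobremesa" as a concept rendered in English by a phrase.

```python
# Single-word terms per language, keyed by hypothetical definition keys.
WORDS = {
    "en": {3001: "Beetle", 3002: "insect"},
    "es": {3001: "vocho", 3002: "insecto", 4001: "sobremesa"},
}

# Phrases supplied for concepts that lack a single word in a language.
PHRASES = {
    "en": {4001: "after-meal conversation at the table"},
}


def render_key(lang, key):
    """Prefer a single word; fall back to the language's phrase."""
    word = WORDS.get(lang, {}).get(key)
    if word is not None:
        return word
    return PHRASES[lang][key]


def render_position(lang, keys):
    """A word-position may carry several definitions (a double meaning)."""
    rendered = []
    for key in keys:
        text = render_key(lang, key)
        if text not in rendered:  # avoid repeating identical renderings
            rendered.append(text)
    return "/".join(rendered)


print(render_key("es", 4001))                # sobremesa
print(render_key("en", 4001))                # after-meal conversation at the table
print(render_position("es", (3001, 3002)))   # vocho/insecto
```

How a double meaning should actually be displayed to the reader is an open design choice; the slash is only a placeholder for it.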

I already have crude working prototypes of the Definitionary and the LID. The big hurdle is to create a small number of syntaxes for various writers to choose from. Ideally, there would be a syntax not too different from the speaker's normal written grammar. The syntax would allow the writer to explicitly specify linguistic linkages such as subject-verb order. I estimate that it would take about 30 minutes to learn the syntax. Short sentences will be preferred, leading to haiku-like or poetic documents.

Of course, every language has unique concepts. This is good. First, it highlights cultural differences, which the Definitionary both preserves and illuminates. Second, it is easy for the Definitionary creators to invent a phrase for each language that conveys the meaning of the unique definition from another culture. If the reader needs the exact concept, then they must read the full definition, just like a normal dictionary. The LID makes this simple. In the current implementation, I use a "mouse-over" to pop up the definition.

You can read a LID in any language present in the Definitionary, regardless of the language in which the document was originally composed. This has boundless possibilities to transform humanity. Not only can people communicate without becoming multilingual, but the Definitionary preserves all the unique concepts of each language.

Creating the Definitionary is going to be time-consuming. However, it is helped by the many open source dictionaries currently being created. Also, daily conversation uses as few as 5,000 words. Naturally, technical terms are required, but once again, the number of terms is reasonable, and terms can be added over time.

Grammar of the LID is an interesting problem. Based on linguistic research in the area of universal concepts in language, it is possible to create simplified grammatical encodings that work for everyone. My approach relies on a formal syntactical convention used in composing a LID. Rule-based decoding for each language renders the LID into sensible output. Guides will specify what kinds of compromises have been made for each language, and I'm guessing that someone can read through a guide to their language in a few minutes. For many Indo-European languages the problem will be small due to their overall similarities.
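
One way to picture rule-based decoding is a formal convention that tags each definition key with its grammatical role, plus a per-language rule that fixes the output order. The sketch below is an assumption about how that might look, not the actual syntax: the role names, the `ROLE_ORDER` table, and the keys are hypothetical. The only linguistic facts relied on are that English normally orders subject, verb, object and Japanese normally orders subject, object, verb.

```python
# Per-language ordering rule for a simple three-role sentence.
ROLE_ORDER = {
    "en": ("subject", "verb", "object"),  # English: subject-verb-object
    "ja": ("subject", "object", "verb"),  # Japanese: subject-object-verb
}


def order_keys(sentence, lang):
    """Arrange role-tagged definition keys in the target language's order.

    `sentence` maps a role name to a definition key, which is one way a
    writer could state the linkages explicitly instead of relying on grammar.
    """
    return [sentence[role] for role in ROLE_ORDER[lang] if role in sentence]


# Hypothetical keys for "Tom", "enjoys and appreciates", "Volkswagen Beetle".
sentence = {"subject": 1001, "verb": 2001, "object": 3001}

print(order_keys(sentence, "en"))  # [1001, 2001, 3001]
print(order_keys(sentence, "ja"))  # [1001, 3001, 2001]
```

Real decoders would need many more rules per language (articles, agreement, particles), which is where the per-language guides and their documented compromises come in.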

I'm currently adding American English and Spanish to the Definitionary. I plan to encode my Volkswagen Bug web site (which is quite popular) into LID web pages. At that point, the site becomes available in both English and Spanish. Due to the limited topic, my Volkswagen Bug site requires fewer than 750 definitions.

Although I plan to open source the Definitionary, it makes sense to get the project onto a firm business footing. Conceptually, the business model is based on revenue generated from training, consulting (primarily, document repository indexing), cultural guides, etc. The revenue stream keeps the project going, and is used to fund development of additional languages. It may be possible to get government and/or business funding for some popular languages. However, I expect that the Definitionary project will have to find internal resources for smaller languages.

If you have an interest in the Definitionary and the Language Independent Document, please contact me.

Contact Tom
