Tim Berners-Lee’s vision of the Semantic Web (Web 3.0), as defined in his 2001 paper with Hendler and Lassila, is accurate but incomplete. The evolution of the web is not going to stop at its cognitive stage; it will move towards becoming an ultra-smart agent. For that to happen, the web needs more than knowledge navigation systems. It needs knowledge assimilation systems. The basic form of knowledge assimilation is a conversation. According to George Kingsley Zipf, the linguist who studied the statistical behavior of different languages in his 1932 work ‘Selected Studies of the Principle of Relative Frequency in Language’, conversation could be explained mathematically. Zipf described the behavior of language, but he did not articulate the mathematics of language. The missing mathematics could not only resolve the semantic puzzle regarding the nexus between automation and human language, it could also allow machines to be taught language just as a child learns to speak from a parent. Semantic mathematics could potentially teach machines to think, learn and converse. These conversational machines could cluster the web and together create the ‘Web Singularity’, a smart, intelligent web which could obviate the need for a supercomputer in every home, as it will rely on knowledge assimilation and consistent semantic learning.
Big Data – Small Problems
The pre-internet era was about physical books and libraries; information was at a premium and access to data was for the elite. This is why data had a mystique about it. Though ‘Big Data’ is nearing 70 years as an idea, the world still faces problems linked to poverty, disease, conflict, population, energy, economics and climate change. One could conclude that big data can offer us a window onto a more objective world, but that it is oblivious to what makes the data tick and to what knowledge is. We have ostensibly made the science fiction of 1986 a reality (take Star Trek IV: The Voyage Home, in which Scotty assumes he can talk to a computer), but it took about 25 years for Siri to appear on our phones.
The semantic web, also referred to as Web 3.0, was expected to be a web of data that can be processed by machines, allowing faster and more optimal search. The expectation was meaningful manipulation: a language through which machines could make databases talk. The databases still don’t talk, but some machine reading has already started happening between inter- and intra-domain databases. It may still take a while before computers can talk.
Tagging data is assumed to be a preliminary step towards listening computers. First comes the tagging, then the reading, the relating and the listening. After that comes “I don’t understand”, but that is fine if the user is patient and willing to give the computer feedback about where it is wrong, much like a parent teaching a child. This is not how we perceive technology today, and it is not how Tim Berners-Lee imagined the semantic web process, which is more about knowledge navigation than knowledge machines, more about searching for knowledge than about assimilating it.
Even if we assume that the stages of Berners-Lee’s vision of the semantic web are sequentially correct, and the industry starts adopting technologies for tagging data, we might still lose some of the older, untagged information. A lot of information might simply stay untagged and hence unsearched. The problem can only be addressed if the intelligent web tags itself as it adapts to old and new information. This might seem like science fiction, but if we aspire to an intelligent web, tagging is the least of its impediments.
The Semantic Puzzle
One of the surprises I had in 2014 was the number of data-sharing companies presenting at the Web Summit. The speed with which we are moving to the cloud and overcoming resistance to data sharing suggests that we are moving in the right direction. Adding logical inference to the interconnected databases brings us closer to what Berners-Lee, Hendler and Lassila envisaged and wrote about in Scientific American in 2001, but this is not all that the semantic web can accomplish. The semantic web is the step before the intelligent web. This is why the assumptions we make about it play a cardinal role in the future of the web. If the assumptions are wrong, the intelligent web may miss us by a generation, and we may still be in the dark tunnel of data, without knowing how to light the proverbial firewood.
Berners-Lee et al. said that “The human language thrives when using the same term to mean somewhat different things, but automation does not.” The assumption that automation cannot flourish where human language prospers is the semantic puzzle that could circumscribe our vision.
Granted, automation is limited today in learning how humans use language, but it is the lack of ambition to learn this subtlety that is the real hurdle to creating the intelligent web. The final objective of the web is to have a conversation with the user. If computers can listen, they should also be able to hold a conversation: a thinking web. Semantics cannot simply be visualized as a half-suspended bridge to knowledge navigation. Any semantic bridge we build should allow humans to tap into the knowledge of their own network: a real journey, with an intelligent guide.
Semantics is how machines can come to understand human language and even become better at it than we are. Only then can we discuss mathematical conjectures, puzzles and ideas of complexity with a machine that is smarter than us.
We may have unintentionally slowed down this process through our previous assumptions, or we may wish to slow it down intentionally, fearing the uncertainty that accompanies a thinking web. Eventually, though, the ‘Web Singularity’ will become a reality, and it will obviate the need for a supercomputer in every home. The intelligent agent will be for everyone with an internet connection.
Mathematical History of Language
The answer to the semantic puzzle can be found in the mathematical history of language. In 1916 the French stenographer J.B. Estoup noted that the frequency (f) of a word and its rank (r) in a frequency-ordered word list were related by a “hyperbolic” law, which stated that the product of rank and frequency (rf) was approximately constant. This meant that human use of language followed a mathematical distribution. The American linguist George Kingsley Zipf (1902–1950) confirmed that the hyperbolic rank-frequency relationship appeared to be a general empirical law, valid for any comprehensive text and with surprisingly high accuracy. This is Zipf’s law.
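The rank-frequency relationship can be checked on any text by counting words, sorting them by frequency, and multiplying each rank by its frequency. The following is a minimal sketch in Python, not a method taken from Estoup or Zipf. The function name rank_frequency and the toy corpus are assumptions made purely for illustration, and the corpus is built by hand so that the product is exactly constant.

```python
from collections import Counter
import re


def rank_frequency(text):
    """Return (rank, word, frequency, rank * frequency) rows,
    ordered from the most frequent word to the least frequent."""
    # Crude tokenizer (an assumption): lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    rows = []
    for rank, (word, freq) in enumerate(counts.most_common(), start=1):
        rows.append((rank, word, freq, rank * freq))
    return rows


# Hypothetical toy corpus, constructed so that frequency = 12 / rank.
sample = " ".join(["the"] * 12 + ["of"] * 6 + ["and"] * 4 + ["to"] * 3)

for rank, word, freq, product in rank_frequency(sample):
    print(rank, word, freq, product)
```

On this constructed sample the last column is 12 for every word. On a real, comprehensive text the product would only be approximately constant, which is the empirical regularity the law describes.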
In their paper “Zipf’s law, hyperbolic distributions and entropy loss”, Harremoës and Topsøe describe how Zipf argued that language development was about a vocabulary balance driven by two opposing forces, the forces of reversion (unification) and diversion (diversification). The force of reversion tended to reduce the vocabulary and corresponded to a principle of least effort on the part of the speaker, while the force of diversion had the opposite effect and was linked with the listener (auditor). Zipf did not transform these ideas into a mathematical model; his basic consideration was that conversation (knowledge) was a two-person game between a speaker and a listener.
The Intelligent Agent
Assuming that the missing mathematics of Zipf is available today, machines could be taught language just as a child learns to speak from a parent. The child adopts the path of least effort, while the users make an effort to impart meaning to the speech. This is unlike what is happening today, where conversational learning serves knowledge navigators and not machines which assimilate knowledge. Conversation is knowledge. Moving from machines that read information to machines that assimilate knowledge as intelligent agents is a dramatic change.
Once the machines start understanding, leveraging their ability to read a cross-section of tagged databases and their ability to comprehend the subtlety of human language, they will be able to assimilate knowledge and become intelligent. These machines will crawl the web and look for solutions to complex problems as they become more and more intelligent. This will not be artificial but pure intelligence. It is already happening as supercomputers assimilate information, but that is centralized learning. Knowledge assimilation will eventually move to the web, as the mass of cross-sectional domain data is on the web.
Singularity is about knowledge assimilation, and web singularity is about the assimilation of knowledge on the web, the latter being exponentially larger in scope than the former. As scary as it may sound, the decisions are still human and the web agent remains a decision support system. The internet is used for good and for bad. The intelligent web will be no different, but the pace of science will accelerate as knowledge assimilation systems start assisting in developing solutions for bigger world problems, and maybe even for problems beyond this world.
 Rider, F. (1944). “The Scholar and the Future of the Research Library: A Problem and Its Solution”.
 Berners-Lee, T., Hendler, J. and Lassila, O. (2001). “The Semantic Web”, Scientific American.
 Estoup, J. B. (1916). “Gammes Sténographiques”, Institut Sténographique de France.
 Zipf, G. K. (1932). “Selected Studies of the Principle of Relative Frequency in Language”.
 Harremoës, P. and Topsøe, F. (2005). “Zipf’s law, hyperbolic distributions and entropy loss”, Electronic Notes in Discrete Mathematics.