From Individuals to the Global Community: the Hidden Contradiction of the Cultural History of Classification and the latest browsing technologies

László Z. Karvalics
Information Society Research Institute
H-1111 Budapest, Műegyetem rkp. 9.


We would like to show that there is an irreconcilable contradiction between the collective production of knowledge and the individual creation of its organizational casting mould, which is therefore incapable of following the growth of the amount of information and its “inextricable intertwining”. A solution seems to be coming from the Web: the latest browsing technologies are able to “compress” the wisdom of millions of users. The real paradigm shift, however, can be expected neither from technology nor from the user, but from specific hybrid systems of the two. Let us focus our attention on Human Agents.

Categories and Subject Descriptors

H.3. Information Storage and retrieval
H.3.2. Information storage, Record classification
H.3.3. Retrieval models
H.3.5. Web-based services

General Terms
Documentation, Human Factors, Theory

Classification, Knowledge, Browsing



    William Charles Berwick Sayers’ 1926 handbook compares classification to a map that provides “just as comprehensive and clear picture of knowledge as the cartographer does of given geographical areas”. [ 1]

    When displaying the relationships between certain fields of knowledge and emphasizing certain features, the expert of classification uses the cartographic method; however, the continuous movement of knowledge and the unceasing transformation of specific fields, when compared with the unchanging geographical conditions, make it almost impossible to reassuringly carry out the work of organization.



    It is as if we were in an epistemological-historical exhibition showcasing the renewing experiments of classic figures in classification theory to make order in the World. It is hard to escape from the idea that if we try to approach the three-dimensional Universe of knowledge through the two dimensions of the map, we will always be in a chain. There is a hitch here, which is not immediately apparent behind the veneer of transitional solutions, half-successes and functional systems: however brilliant the alternative classification systems are, they set about elaboration again and again by using the same method, despite the unstoppable growth of storable and retrievable information. Someone devises a system. If he or she is a mastermind possessing appropriate encyclopedic knowledge and an innovative tendency capable of blending invention with extraordinary diligence, there is a good chance that his or her system will provide the vital and refreshing organizational and retrieval methodology for a library environment longing for a solution. However, the collective production of knowledge will always be in irreconcilable contradiction with the individual creation of its organizational casting mould, which is therefore incapable of following the growth of the amount of information and its “inextricable intertwining”.

    Dewey’s system, for example, had its impact from the “algorithm” side. As for its content, it was overabundant in mistakes and prejudices even at the very moment it emerged, not even squaring with the scientific knowledge of the day. It was America-centered, and it arbitrarily divided fields that functionally belonged together; still, it could serve libraries as a simple, applicable and expandable system. For a long period, the prevailing classification systems had to be seen as wonderful individual artworks. The history of classification (according to Ranganathan) was written by “flaming geniuses”. The individual systems typically did not emerge from one another, accumulating the epistemological capacity of the “geniuses”, but attempted to surpass the earlier methods.

    What has been “frozen” into the classification systems? The creator genius’ (individual) logic and conception of the universe, and the common sense of the librarians (individuals) performing the classification. What was the implication of all that? Even the most effective classification system was “merely” capable of bringing a durably functioning system to a specific interest group in a given institutional environment, presenting more and more barriers to any demand beyond that scope. This “merely” is of course an act of extraordinary cultural historical significance: a technology enabling the revolution of the extensive diffusion of acquiring knowledge. Accompanied by other succeeding technologies, it deserves a unique place in museums and in those libraries that play a similar role up to the present time.

    The insufficiency of UDC prompted new-generation methods (coordinate indexing, then especially thesaurus building), which not only adjusted to the new functions, but also put the specialists of particular fields of knowledge in place of the superficially universal Individual forcing Reality into casting moulds. The competent Individuals then precisely and seamlessly projected the knowledge of a niche special field, successfully coping with the emerging terminological and retrieval challenges.

    The limits of the cognitive capacity of the librarians processing the new items could in the meantime be extended by creating a comprehensive space of division of labor among the “Individuals”. When Vickery supplements the “five Ranganathan laws” of library science with a sixth one saying “no library may exist in isolation”, he reflects the problem solving force of the network principle [ 2] .

    Yet, in the meantime, the World has rushed past libraries, too. Is it by accident that the ideas of “information explosion”, and especially “information pollution” and “drowning in the sea of information”, emerged from the confused minds of library theoreticians at the beginning of the sixties? And where do the sixties stand on the millennium scale of knowledge growth? If we can trust what astronomers say, three-dimensional space-mapping technology alone is expected to increase the knowledge we have accumulated so far on the cosmos a hundredfold in the coming few years.

    Is there any method that is capable of channeling the bewilderingly increasing flow of classifiable and retrievable information? Are there any procedures that offer an opportunity to bridge the contradiction between the communal character of knowledge production and the individual character of organization, in circumstances harder than ever before?



    Yes, indeed, some Web technologists proud of their achievements claim. We have new technologies allowing millions of Individuals to be present in the space of knowledge production and processing directly linked to use. Instead of geniuses and librarians, it is the animate vehicles of knowledge themselves who, through the utilization of their experience and activity, “preshape” the awesome volume of knowledge raw material for others. Do you know how the leading Internet search engine, Google, operates? It ranks hits based on the number of sites linking to the found pages as a reference, which means that its search algorithm depends on how keen users of the information evaluated the given source. Do you know that the best way to quickly approach the professional literature of a hot issue now is to visit the huge online bookstore Amazon? From the topic-combining habits of earlier users to simultaneous searches on authors, from the orienting power of reader reviews of a given book to a list of works dealing with similar subjects in the given field, you get navigation support based on the evaluating and interpreting capacity of a virtual professional or interest community that involuntarily assists your searches. Just like water- or wind-power stations that transform an ever-existing natural force into a source of energy... Technology has helped us reach a point where, instead of the isolation and mediation that served as the basis of development for thousands of years, a new horizon of emancipation and direct knowledge transfer is preparing to unfold.
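    The link-based ranking idea mentioned above can be illustrated with a deliberately naive sketch. It only counts distinct inbound links and orders search hits by that count; it is not Google's actual algorithm, which weights links in more elaborate ways. The function name, the page names and the link data are all hypothetical, chosen only for illustration.

```python
from collections import Counter


def rank_by_inbound_links(links, hits):
    """Order `hits` by how many distinct pages link to each of them.

    `links` is an iterable of (source_page, target_page) pairs.
    Duplicate pairs and self-links are ignored, so a page cannot
    raise its own standing. Ties keep the original order of `hits`.
    """
    inbound = Counter(
        target for source, target in set(links) if source != target
    )
    return sorted(hits, key=lambda page: inbound[page], reverse=True)


# Hypothetical link graph: three sites refer to guide.example,
# one site (plus a self-link) refers to notes.example.
links = [
    ("a.example", "guide.example"),
    ("b.example", "guide.example"),
    ("c.example", "guide.example"),
    ("a.example", "notes.example"),
    ("notes.example", "notes.example"),
]
ranked = rank_by_inbound_links(
    links, ["notes.example", "orphan.example", "guide.example"]
)
print(ranked)
```

    Each link is treated here as one "vote" cast by the author of the linking page, which is the sense in which the ranking reflects how earlier users of the information evaluated a source.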

    From this angle, even the first- and second-generation Internet search engines can be viewed as museum pieces: the once dominant but short-witted AltaVista and its successors were swiftly replaced by Google. And without a shadow of a doubt, Google too will soon fade away as technologies are implemented that attach the effective auxiliary troops of automation, developed for the given problem-solving level, to the channeled body of knowledge of the “Individuals”. The horizon seems impressive: intelligent agents and powerful new search and filter engines under the umbrella of the “Semantic Web”, which often work even without classification operations, through mechanical keyword searches, open up unprecedented new prospects for engineers and technologists of information. Reality, however, suggests something slightly different: despite all the promises, there seems to be a limit that these technologies, with their current logic, will never be able to break.


    Technology took an offensive from two directions in order to make individual knowledge communal and to make common knowledge widely accessible.

    Collaborative filtering utilizes the “imprints” of completed online activities. It extracts knowledge we would not have been able to get access to being short of other tools. “Recommendation”, the result of collaborative filtering, provides relevant input for all those who plan knowledge processes or business actions in the given space of activities.
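    How such “imprints” can be turned into recommendations may be sketched as follows. This is a minimal illustration in the spirit of item-to-item collaborative filtering, assuming that the only available imprint is which items occurred together in the same user's basket, and using cosine similarity over those co-occurrences. It is not the production method of any particular bookstore; all names and data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def item_similarities(baskets):
    """Cosine similarity between items, based on basket co-occurrence."""
    item_count = defaultdict(int)   # baskets containing each item
    pair_count = defaultdict(int)   # baskets containing both items of a pair
    for basket in baskets:
        items = sorted(set(basket))
        for item in items:
            item_count[item] += 1
        for a, b in combinations(items, 2):
            pair_count[(a, b)] += 1

    sims = defaultdict(dict)
    for (a, b), together in pair_count.items():
        score = together / sqrt(item_count[a] * item_count[b])
        sims[a][b] = score
        sims[b][a] = score
    return sims


def recommend(item, sims, top_n=3):
    """Items most similar to `item`, best first (ties broken by name)."""
    neighbours = sims.get(item, {})
    ordered = sorted(neighbours, key=lambda other: (-neighbours[other], other))
    return ordered[:top_n]


# Hypothetical baskets: each set is one user's completed activity.
baskets = [
    {"classification", "thesauri"},
    {"classification", "thesauri", "indexing"},
    {"classification", "indexing"},
    {"thesauri", "semantic web"},
]
sims = item_similarities(baskets)
print(recommend("classification", sims))
```

    No user states an opinion in this sketch: the recommendation emerges purely from the traces of earlier activity.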

    That is accidental knowledge, however, and it becomes clear if we start to examine the rating-type metadata from the number of seconds spent on a given section of the screen to eye-movement tracking. That mapping of the behavior is a sort of “micro context”, the results of which enable us only to manage, indirectly, elementary knowledge bits (opinions, choices, preferences, inter-elemental connections).

    More complex domains of knowledge may be extracted from texts. We attempt to perform the same operations on texts, perceived as a sort of record or imprint of the patterns of the human mind, as on the “imprints” of our behavior. We try to get to the macrocosm of meaning through the micro context of words (or, more precisely, imagery). Hardworking metadata generators, however, can only regenerate for us what they can “see”. It is therefore evident to experts dealing with the issue that the most important thing for the new-generation solutions assigned to the wider concept of the “Semantic Web” will be to build the ontology lying behind the processing elements. How shall we arrange or cluster the concepts to be able to comprehend them and the correlations connecting them and, finally, to comprehend the Reality we aim to grasp through concepts? Now we are again where we started from: classification, the conceptual imprinting of the world.

    And beyond. The Web has long attracted such activities that faced completely analogous challenges to classification at some point in serving the knowledge flow. The overview of these activities will help us comprehend where we can expect the solution for advancement to come from.



    “Knowledge can be understood as interpreted content, available to a member of a community and always shaped by a particular context. Digital representations of content and context become e-knowledge through the dynamics of human engagement with them. The digital elements of e-knowledge can be codified, combined, repurposed, and exchanged … e-Knowledge is changing the traditional value chain to a value net … The e-Knowledge Industry may democratize the provision and use of knowledge, reshape power centers, recalibrate the economics of publishing and enable new roles”.

    Using the brief definitions of the latest book of Norris, Mason and Lefrere [ 3] , instead of the classification of different operations performed on knowledge, those channels will be the most important through which information will flow in order to meet certain functions. Norris et al. view the (vertical) systems of Today from the institutions behind the channels, while they characterize (horizontal) Tomorrow with public access to universal knowledge toolkits (services, infrastructure, solutions, repositories).


    Table 1. Paradigm shift in e-content channels [ 4]

    Today’s Vertical Channels for e-Content | Components of Tomorrow’s Horizontal Channels for e-Knowledge
    Traditional Publishers and Direct-to-Digital Publishers: traditional publishers like Harcourt Brace, Pearson, Thomson and new direct-to-digital publishing enterprises | Content/Context Repositories: discipline- and institution-specific repositories, plus marketplaces that aggregate content repositories into a meta-marketplace
    Course and Learning Management Systems: course materials held by WebCT, Blackboard, Click2learn, Outstart, and other applications | Content Creation Tools: tools for creating and managing content/context through Learning and Content Management Systems (LACMS)
    Universities and Colleges: university presses plus faculty course materials | Value-Added Content Services: additional services that enhance the value of content and codified context in learning objects
    Professional Societies and Associations: trade publications plus tradecraft-rich bodies of knowledge | Exchange Infrastructure: the marketplace exchange service that enables metering, repurposing, combining of content by demand aggregators, and direct users

    The above division still does not provide all the answers. What we are interested in is the various ways in which collectively acquirable contents may be created as a result of individual activities. And what leads to that? The recognition that education planning and knowledge repository building represent a problem environment completely analogous to that of the classification systems serving as the basis of library searches (and also hidden in search engines). As education planning and knowledge repository building are increasingly becoming Web-based, their common nature, in the form of the Learning Objects of online instructional materials and the Web-based knowledge repository entries, becomes increasingly apparent.

    A theoretically clarified science taxonomy structure should stand behind the construction of classification and the correlation net of the Learning Objects (LO). In the system of course materials “distilled” from that, elementary LOs would be determined by the standards of the conventionally structured individual sub-fields. Such standards, however, are rare, as the “deep structure” of our knowledge of the world is being transformed at an extremely rapid pace.

    Encyclopedias - the imprints of “traditional knowledge” - are therefore suitable to be the fossilized course books of the Past, but the genre itself will no longer be able to meet the demand of today’s constantly transforming dynamic knowledge flows.

    The producers, upgraders, and supervisors of the Learning Objects, and the participants of the increasingly popular network movements which bring the mentality of the open software movement into encyclopedia production, all do the same thing: they all transform their own, personal knowledge to communal knowledge.

    Table 2 intends to show that we can go along the individual-collective track via various vehicles. Using the “reflectivity” of knowledge production as a reference point, the new knowledge-condition can be achieved:

    - in an emergent way (where producers of the unique pieces of knowledge are not aware that a more comprehensive Whole emerges from their partial results)

    - by simple mediation, where knowledge emerges from “movement”

    - by collection, through which information repositories of later knowledge flows emerge.

    Currently, these activities are present on the Web in organized, direct ways, and in spontaneous, self-organizing, indirect ways. Technology is embedded in this communal space, and the demands of unique special communities call systems with increasingly powerful solutions into life.

    Table 2. Directions of e-knowledge: the way the individual becomes collective

    Emergent process

    Information and knowledge mediation

    Knowledge repository building




    Collaborative problem solving with separate modules

    Edited, systematic newsletters

    Online workgroups

    Learning object development


    item-to-item collaborative filtering





    Open-source content development


    Previously, we claimed that the technological innovation of knowledge processes from the side of Web-based technology has almost come to a deadlock, or at least it is close to its “natural” limits.

    But before we formulate our hypothesis about the possible scenario of advancement, it seems reasonable to turn to cultural history. Can we apply the experience gained in earlier periods when, faced with similar epistemological challenges, we had to find new solutions?



    Learning a professional skill is often based on social interaction, and competent use of appropriate technologies. Schön [ 7]

    Looking at the cultural history of knowledge technology, we can recognize a peculiar rhythm. Subsequent to the innovation phase, but prior to the diffusion of wide-scale usage, experts specialized in particular tasks were always needed in order to make a solution functionally “operate”. We already had writing technology, but it was still the duty of clerks to serve the needs of the illiterate. We already had the typewriter, but we needed typists to produce carbon-copied typescripts in precise layout. We already had the computer, but mathematicians (accountants, pay-roll clerks, etc.) needed technicians to carry out the computations. (Even today, many executives still have their assistants print company e-mails and dictate the answers.)

    Or going back to the world of libraries: we had already had the book, but usually it was made usable with the help of librarians carrying out retrievals using alphabetic booklists. Plentiful relevant pieces of literature were accumulated, which were made available to the researcher of a special professional field by bibliographers and Human Agents revealing the sources. When the special terminology of a science or a special professional field was spreading, thesaurus-building Human Agents made them easy to survey. Catalogue trays had already been available for library searches, but it was faster and more reliable to ask the librarian Human Agent to help.

    What happens when the use of technology diffuses? What happens if everyone can write, typewrite (use the keyboard), when the user interface makes the mediation of the technician unnecessary (the technician is “integrated in the program as an algorithm”), when advanced search engines immediately provide the found items?

    The mechanization of brainwork has leaped a level forward liberating the former Human Agents from performing their dull mechanical exercises. On the other hand, it also paved the way to enable new types of Human Agents producing higher levels of value-adding to start to assist mental activities. (See more [ 18] .)

    The organization of knowledge proprietors into networks is able to “push” the process of knowledge production forward even without tools. “One of the motivations for launching the ARPANET project in 1960s was the belief that by connecting different computing sites, communities of computer programmers could more efficiently share their programs and knowledge”. (That is why “two of the most influential visionaries of ARPANET, J.C.R. Licklider and Robert Taylor, may have believed in 1968 that such online communities would radically transform computer programming, but also society, work, and human thinking” [ 10] .)

    Today, it seems clear that although networked knowledge workers are assisted by numerous, increasingly smarter Software Agents, we would also need Human Agents who could “preshape” the knowledge raw materials by means of the specialized use of the available tools. We assert that radical advancements in search capabilities would be achieved not by the automated new-generation agents, but as a result of developing hybrid systems combining these agents with Human Agents. Advancement therefore can be grasped not as a technological, but as an organizational and business category: when will those viable businesses emerge that can sustain tens of thousands of new-generation Human Agents? Who are those knowledge workers, and what are those professions whose representatives would evolve in that direction?


    Technology, social practice, and knowledge complement each other and their evolution is part of the same process. Ilkka Tuomi [ 10]

    Even the current search capability could be considerably increased if specialists in search and information preprocessing assisted the work of those who, as a result of the search, add value to previously produced information. Which specific groups of professions need such support? Who supports these activities now, and what sort of work organization methods do they use?

    Table 3. Current Human Agents and their partners

    Social sub-system


    Agent type

    Work organization



    Junior researcher


    Project contract, outsourcing


    Corporate intelligence

    Environmental scanning


    Industrial Analyst




    Strategy- and decision making


    Hierarchical employment, tender

    According to another approach, the future professional model of the digital sector will be the interface manager, whose job will be to coordinate the different phases of the production chain from the idea to its implementation. Cultural industries and institutions should try to integrate these new profiles into their organizational structures.

    The following list from a European Council document [ 12] contains those specialists who will be the most important customers of the next-generation Human Agents.

    Table 4. New interface manager’s profiles [ 12]

    Content and technology profiles

    Creative content providers
    Computer Based Training (CBT) authors
    Multimedia authors
    Multimedia developers/storyboarders
    Content coordinators
    Editors of off-line (CD-ROM) and on-line products

    Design and technology profiles

    Screen and multimedia designers
    Computer animation designers
    Media designers for picture and sound
    Film and video editors
    Software developers

    Management and technology profiles

    Multimedia project managers
    Executive producers
    Legal experts for multimedia products
    Systems analysts
    Information economists

    Distribution and technology profiles

    Marketing managers of ICT systems
    Information brokers/multimedia booksellers
    Information brokers/multimedia librarians
    Archivists of electronic products
    Documentalists of electronic products

    The delivery of search/prefiltering services used by interface managers is the role of a technologist: involving a relatively easily acquirable scope of duties by utilizing the current capabilities and tools in an appropriate instructional environment (even through self-instruction).

    The professional profiles we are the most interested in, however, imply the role of an engineer (more precisely: a knowledge engineer). Behind the increasing number of knowledge technologists there will also be groups of specialists that will be specialized in the development of models, methods and solutions available for Human Agents.

    The professions listed in Table 5 currently do not (or just partly or occasionally) exist. Nevertheless, we are sure that some of them will soon become much-sought players of the labor market.

    Table 5. Future specialists behind the next generation Human Agent

    Proposed name | Scope of duties | Current profession from which it will recruit its representatives
    Ontology- and metadata-making methodology developer | General solutions, algorithms for all ontology- and metadata-making processes | Thesaurus maker,
    Inter-disciplinary monitoring expert | Makes reports on the creative intersections of the new achievements of distant fields of knowledge |
    Data Set Visualizer | Makes data sets more easily searchable and transparent through visualization | Information architect
    Pattern-analogy searcher | Searches for new correlations in statistics | Environmental Scanning specialist
    Cognitive frame re-engineer | Reinvents fossilized cognitive patterns (language, science taxonomical disciplines, subjects, evaluation system, terms) that often hinder searches |


    At the same time, there is a specialist type that already exists, yet is regarded as a unique media player, a sort of network freak. In fact, we can catch the emergence of a new type of Human Agent in the act if we observe them: the hardworking tradesmen of blogs (Weblogs), the increasingly popular personal and opinion journals.



    “… all meaningful human activity is inherently social”

    Y. Engeström, cited by Tuomi [ 10]

    Technology usually develops through incremental improvement within an existing community of practitioners, but sometimes the community faces problems that require radical innovation

    (Constant) [ 9]

    According to Gillmor [ 6] , the “Blogosphere” is a universe of Weblogs, “often pointing to other weblogs in an ecosystem of news, opinions and ideas”. They are “frequently updated, with items appearing in reverse chronological order. Typically they include links to other pages on the Internet, and the topics range from technology to politics to just about anything you can name. Many weblogs invite feedback through discussion postings”. “In some areas, like tech reporting, the Web logs have largely replaced the professionals” [ 17] . Yet, the world of blogs has so far been dealt with and analyzed mainly as a category of online publishing and media, while its role as a knowledge producer and mediator has largely been ignored.

    Then, at the beginning of 2003, Google purchased Pyra Labs, which had created some of the earliest technology for writing weblogs; this suddenly brought bloggers as Human Agents into focus. As Gillmor [ 6] says, “More than most Web companies, Google has grasped the distributed nature of the online world, and has seen that the real power of cyberspace is in what we create collectively. We are beginning to see that power brought to bear”. 200,000 of the 1.1 million registered users of Pyra's Blogger software are actively running weblogs. It is not by chance that many experts claim that the newly developed “Google News gauges the collective thoughts of more than 4,000 news sites on the Net”. [ 6]

    But what would a search company want with a tool for making weblogs? As Kahney [ 11] and his expert respondent, Cleveland said, “Google's acquisition of Pyra would help Google create a more accurate search engine by adding rich new sources of data gleaned from weblogs”. The real consideration therefore is “the scores of links webloggers create every day to content on the Web. Weblogs are a rich source of links, which are posted in a fast, timely manner. The technology could allow Web surfers to find not just breaking news stories, but those highly ranked by the weblogging community. In addition, these stories could be accompanied by the best comments made by popular webloggers, or by those writing in a certain language or from a particular country”.

    We must recognize that blogs make the 1945 dream of Vannevar Bush almost perfectly come true. “There is a new profession of trail blazers, those who find delight in the task of establishing useful trails through the enormous mass of the common record. .... This is the essential feature of the memex. The process of tying two items together is the important thing”.

    Bush, the author of the influential work ‘As we may think’ [ 13] is perfectly aware that “the repetitive processes of thought are not confined, however, to matters of arithmetic and statistics. In fact, every time one combines and records facts in accordance with established logical processes, the creative aspect of thinking is concerned only with the selection of the data and the process to be employed and the manipulation thereafter is repetitive in nature and hence a fit matter to be relegated to the machine”.

    And there is no time to stop: the intelligence accumulation of Human Agents is improving with new capabilities. The technology of “moblogging” (short for mobile blogging) has also emerged: “along with audioblogging, plain-text blogging is undergoing a subtle transformation as people begin to use their cell phones and other mobile devices to send written updates to their Weblogs”. [ 16] It will be interesting to see who else will join these trail blazer bloggers in the next few years.



[ 1] Sayers, W.C.B.: The nature and purpose of classification. In: A manual of classification for librarians (completely revised and partly re-written by Arthur Maltby). 4th ed. London, Andre Deutsch, 1967, pp. 25-32. (A Grafton book)

[ 2] Ungvary, R. - Orban, E. (eds): Osztályozás és információkeresés: kommentált szöveggyűjtemény (Classification and information seeking: textbook with comments). Vol. I. OSZK, Budapest, 2001, p. 64.

[ 3] Norris, Donald – Mason, Jon - Lefrere, Paul: Transforming e-Knowledge - A Revolution in the Sharing of Knowledge. Society for College and University Planning Ann Arbor, Michigan, 2003 See:

[ 4] Source of Table 1: (Norris, Mason, Lefrere, 2003)

[ 5] Linden, Greg - Smith, Brent – York, Jeremy: Recommendations: Item-to-Item Collaborative Filtering Internet Computing 2003 January/ February Issue

[ 6] Gillmor, Dan : Google Buys Pyra: Blogging Goes Big-Time Silicon Valley, 2003 February

[ 7] Schön, D.A.: The Reflective Practitioner. New York, Basic Books, 1983

[ 8] Engeström, Yngvar. Learning by Expanding: An Activity Theoretical Approach to Developmental Work Research. Helsinki: Orienta Konsultit. 1987

[ 9] Constant, E.W.. "The Social locus of technological practice: community, system, or organization?" In: W.E. Bijker, T.P. Hughes, and T.J. Pinch (editors). The Social Construction of Technological Systems: New Directions in the Sociology and History of Technology. Cambridge, Mass.: MIT Press, 1987 pp. 223-242.

[ 10] Tuomi, Ilkka: Internet, Innovation, and Open Source: Actors in the Network. First Monday, volume 6, number 1 (January 2001)

[ 11] Kahney, Leander: Why Did Google Want Blogger? Wired, 2003 February,1282,57754,00.html

[ 12] Cultural work within the Information Society. New professional profiles and competencies for information professionals and knowledge workers operating in cultural industries and institutions. European Council, 1999

[ 13] Bush, Vannevar: As we may think. The Atlantic Monthly, July 1945, Volume 176, No. 1, pp. 101-108

[ 14]

[ 15]

[ 16] Festa, Paul: Dialing for bloggers CNET Feb. 24.

[ 17] Festa, Paul: Blogging comes to Harvard. Interview with Dave Winer CNET Feb. 25.

[ 18] Kampis, George: The Natural History of Agents. In: Gabor, Tatai – Laszlo, Gulyas (Eds.): Agents Everywhere (Proceedings of HUNABC ’98). Springer, Budapest, 1999, pp. 10-24.

[ 19] Straccia, Umberto: Project Overview: EUROGatherer – A Personalized Information Gathering System In: [ 21] p.7.

[ 20] Glance, Natalie – Arregui, Damian – Dardenne, Manfred: Knowledge Pump: Community-Centered Collaborative Filtering In: [ 21] p.83.

[ 21] Filtering and Collaborative Filtering. Fifth DELOS Workshop. Budapest, 10-12 November, 1997. DELOS Working Group Reports pp 1-129.