From: David Novak
Subject: Information Research FAQ v.4.7 (Part 1/6)
Date: 29 Dec 2004 05:28:14 GMT

Archive-name: internet/info-research-faq/part1
Posting-Frequency: monthly
Last-modified: April 2002
URL: http://spireproject.com
Copyright: (c) 2001 David Novak
Maintainer: David Novak
The Information Research FAQ
100 pages of search techniques, tactics and theory by David Novak of the Spire Project (SpireProject.com)
Welcome. This FAQ addresses information literacy: the skills, tools and theory of information research. Particular attention is paid to the internet as both a reservoir and gateway to information resources.
This FAQ is an element of the Spire Project, the primary free reference for information research and an important source for search assistance. Do visit http://spireproject.com . It is free and complements this FAQ with links, forms and tools.
This FAQ resides with pictures at http://spireproject.com/faq.htm and as text at http://spireproject.com/faq.txt
*** The Spire Project also includes a 3 hour public seminar titled
*** Exceptional Internet Research. This is a fast paced seminar
*** supported with a great deal of webbing, reaching to skills and
*** research concepts beyond the ground covered on our website and
*** this FAQ. http://spireproject.com/seminar.htm has a synopsis.
*** I am in Europe, seminaring in Ireland and Europe, though I
*** will be returning to the US shortly, and to South Australia for
*** a seminar this October.
Enjoy,
David Novak - david@spireproject.com
The Spire Project : SpireProject.com and SpireProject.co.uk
. . . . . . . Prelude.
1 . . . . . . . Everyday searching has a simple approach.
2 . . . . . . . Searching for specific, quality information demands a more complex approach.
3 . . . . . . . Let's understand how information is arranged on the internet.
4 . . . . . . . Each format (book, article, web, etc...) has unique search tools and resources.
5 . . . . . . . Specific guidance on libraries, discussion groups and other venues.
6 . . . . . . . Review and discuss types of information in specific fields.
7 . . . . . . . Boolean, proximity, field searching, Dewey and patent classification.
8 . . . . . . . Quality depends on source, currency, search process, reliability...
9 . . . . . . . Commercial information industry, libraries and the info-broker.
10 . . . . . . . Information moves and evolves in fascinating ways.
11 . . . . . . . Steps to improve an online search.
12 . . .
Prelude.
Many of us unwittingly digest great amounts of information in the course of a day. Our information needs are more modest and usually repetitive. When we have questions, we reach for a small collection of preferred information sources close at hand with a collection of assessments as to what is credible and trusted.
For a child, these sources include the school library, an encyclopedia and parents. All the sources are trusted.
For an adult, these sources include the state library, the newspaper, bookstores and current magazines. Adults understand truth has become a little more relative, but when the evening news declares presidential hopeful George W Bush is ahead by 3% (on a sample of 707) we slip into thinking he is leading.
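The poll figure is worth a moment of arithmetic. As a rough sketch only, assuming a simple random sample, a 95% confidence level (z = 1.96) and a worst-case 50/50 split (none of which the evening news tells us), the margin of error for a sample of 707 can be estimated like this. The function name is hypothetical.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate margin of error for one candidate's share.

    Assumes a simple random sample; z=1.96 corresponds to 95% confidence.
    """
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

moe = margin_of_error(707)
print(f"Margin of error: +/- {moe * 100:.1f} percentage points")
```

Under these assumptions the margin comes to roughly 3.7 percentage points on each candidate's share, so a 3% lead on a sample of 707 sits inside the noise.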
There is more to information literacy. It is, after all, a profession. There are tools you know nothing about and techniques you have never heard of. There is a specialized vocabulary just made to confuse you. Research, or rather information research (to distinguish it from lab-coat style research) is so very much more involved.
Yet there is great simplicity to research too. Just under the murky mist of confusing resources rests a solid platform to stand on. In any one field there are just a handful of databases, directories and periodicals to consider. After decades of library and information industry evolution, clearly valuable sources have already floated to the top, monopolizing their respective fields. Most cities have just one or two primary newspapers. Large industries like book publishing have few book databases and a handful of primary book distributors.
Enter the internet: not so much a change of information as a revolution in access to information. Previously you could justify having just a handful of preferred information sources because these were the sources easily available. Today, and in the future, we are surrounded by information close at hand. We are dropped into a morass of competing information just waiting to capture our attention, straining both our capacity to absorb information and our capacity to understand the differences between sources.
A great segment of our community will fall back to tried and true information sources they grew up with: state library, bookstore, local newspaper. The better alternative sources will be ignored for no particular reason. The rush of the information revolution will push past them. They will only hear of changes when their information needs suddenly change - and they are confronted with a vast collection of unfamiliar options, and struggle with understanding what sources they need.
A smaller segment of our community, by virtue of frequently tackling questions best answered with unfamiliar sources, will be driven to understand the information world: to become truly information literate.
There is another story here too. The way our society handles information is undergoing some very fascinating changes. Any predictions for the future should acknowledge the tension and flow of information in our society. Take, for example, the vast surplus of information emerging on the internet, and the convulsions of the commercial information industry in response. Rather than focusing on how information is organized, we can also focus on how information becomes organized. The who, where and why of information, the sociological perspective, adds meaning to the phrase "information revolution".
- - - - - - - - - - - - - -
It was another warm day. The young Egyptian boy strode purposefully out the gate towards the river. The Nile was low this time of year. Very abundant with fish and bird life. With luck, Shakh would return at sunset with food for the pantry. Mother would be pleased with that.
Shakh knew fishing had changed little over the last hundred years. The walls of his family's ancestral home had just such a scene of his grandfather fishing on the Nile from a small reed boat. The thinly carved relief was complete with spear, fish, ducks and Shakh's grandmother nearby holding lotus flowers.
Shakh stopped by old-man Jacob on his short walk to the bank of the Nile. He liked the old trader. Years ago Jacob had traveled to the Levant and brought back many strange artifacts. Some even came from as far afield as the Harappan people, who were said to live beyond Sheba, across the waves, some three years' journey away. Shakh especially liked the small black head carved in a style so unlike anything else he had seen.
- - - - - - - - - - - - - -
The Harappan people lived on the banks of the great Indus river in modern-day Pakistan. Theirs was a great civilization, almost on par with the Sumerians and the more distant Egyptians, yet very little remains today. They built vast cities of clay brick with rectangular city blocks. They built drains, public toilets and state granaries. They were the first to populate the Indus river valley. (see http://www.harappa.com/indus2/index.html)
Little remains. The Harappan civilization fell with the arrival of the Aryan race and the intervening millennia treated their past poorly. The arrival of Islam erased much of their history as did the shifting Indus river itself. The British used the bricks from one ancient city in the construction of a great railway. Only today are the archaeological digs once again unearthing the past.
I search for Harappa on the internet. Nothing special: I just type 'Harappa' into any of the popular search engines and uncover harappa.com, a website devoted to some recent information from these digs. Looks good. Pictures of ancient pots. Children's toys. A map to an ancient city.
Of course, Shakh would have known of the Harappan civilization. While it is uncertain whether ancient Egyptians ever visited in person, goods and rumors traveled far from trader to trader. Ancient Egyptians, while not accomplished conquerors abroad, did travel and mix with distant peoples.
Shakh lived in a civilization centuries distant from us, yet both you and Shakh know a similar amount about the Harappan civilization. The intervening years have not made everything clear. Even the information revolution has not changed the facts. Both you and Shakh have just a single source of information about the Harappan civilization. You have the pictures on harappa.com and our short excerpt here. Shakh has the old-man's art object to look at, the old-man's myth of a civilization beyond the waves.
This story carves the act of searching in deep relief. Searching is a skill, a trade and to some a profession. It is also just a simple task of finding information - something we do every day, in so many ways, without any of the difficulties we will get into later in this FAQ.
The difficulties only emerge when you want to do something spectacular. Should you wish to know something specific about the Harappan civilization, or understand something contentious, then you require a greater degree of expertise and experience. The search becomes a challenging adventure in its own right.
- - - - - - - - - - - - - -
The Nile was always a slow river but three months out of the year it burst its banks and flooded the fields, bringing life on the banks of the Nile to a complete halt. For these three months Shakh's family would move into the ancestral home in the streets surrounding the great pyramids. It was an old home, centuries old. Well suited to their needs with a storeroom for food, separate rooms for the parents, and an active social life in close proximity to others. In many ways, this was the most exciting time for young Shakh. For the rest of the year he lived in relative isolation in the village by the Nile. For these three months, he lived in a city, bustling with activity, construction and recreation.
Shakh had expected this year to be like the last but his father secured Shakh an important position - he would be in training to become a scribe. Father had grand plans for young Shakh, plans that extended far beyond life as a scribe. What's more, with luck and further prosperity, Shakh's father had the means to secure his further advance.
- - - - - - - - - - - - - -
Much of ancient Egypt is available for us to read off the walls of the many remaining buildings. They were not a literate nation, yet were able to adorn almost everything with writing and pictures. They lived in the most enlightened society of the day. Years later, Egypt would gift the fledgling Hellenic state a full third of their Greek vocabulary.
This is part of the reason for such an interest in travelling to Egypt. It is the visual symbols that inform us and draw us in so deeply. Standing before the great religious statues, we begin to feel how it was to live and work in that day. To run amok as a young student, waiting for the Nile to subside once again.
Yet, there is much more to knowing ancient Egypt than just the monuments and wall reliefs. Years of study have recovered their lost language of hieroglyphs. Years of archaeology have unearthed their daily lives.
History and Archaeology are fine examples of searching in practice. Both fields struggle openly with the bias and uncertainty each new fact brings forth. Malta is a small island off the coast of Sicily, close to Tunisia. Should evidence emerge of ancient Egyptians living on Malta, what does it mean? Was Malta an Egyptian conquest or an occasional station for their fishing fleet?
This uncertainty applies to all information, in all situations. One of the first events for the new regime in Pakistan was to acknowledge that important national statistics, like the national GDP figures, had been fudged to a serious and significant degree. Important national statistics are not intrinsically true because of their source. This is not a problem solely of underdeveloped nations. Rumor suggests that during the height of Singapore's land value bubble their national figures were unreliable too.
Searching is a skill and an attitude. In this FAQ we progressively unfold the way information is found. Initially, let's cover a simple way to find information; a structured approach to an everyday problem. Afterwards, we shall look more closely, and with more complexity, at the world of information.
Searching is Simple. Section 1
Searching is simple. It starts with a question. It ends with an answer. Everything between is searching. Much of it has to do with the tools you use. Select the right tool and you can get to the answer almost by default. Luckily, for any given topic there tends to be just a handful of must-use tools. For more complicated questions, there are usually plenty of people to ask for assistance.
The answers you are seeking will be found in a selection of different formats. By this I mean books, articles, interviews, and more. This is a very convenient concept and forms the foundation of all our work both here and in the Spire Project. Few research tools cover more than a single format; those that do tend to cover each format poorly. Start a search by selecting the specific format you are seeking. Then, select your preferred search tool from a small collection specific to that format. To get the information, simply follow through and read, search or interview. Everything follows naturally.
Have a Question. Select a Format. Select a Search Tool.
There are just a few formats to consider.
Books . . . . . Dense, factual, comprehensive and a minimum of 6 months to a year old.
Articles . . . . . Shorter than books but focused on one topic.
News . . . . . Short and shallow. Immediate.
Statistics . . . . . Factual. More reliable.
Theses . . . . . Very thick. Deeply researched. Esoteric.
Webpages . . . . . Immediate, mixed quality, with limited factual support.
Interviews . . . . . Immediate, varied quality, partly digested.
Each format has a selection of simple tools to find information. Many of these tools will be on the internet - which may mean easily accessible. A word of caution: try not to confuse search tools that happen to be on the internet with searching internet information. The Amazon.com book catalogue is a search tool useful in locating books. Though on the web, searching Amazon is part of a book search, not a web search. A search of the Reuters newswire is a news search, not a web search, even though Reuters releases current news on the web. Each format should remain distinct in your mind.
Tools to Find Books
1) Some books, particularly classics, are free on the internet through efforts like Project Gutenberg.
2) Libraries allow you to read books. Library catalogues are frequently online.
3) The largest libraries, like the Library of Congress and the British Library, list millions of books in their online catalogues.
4) Most currently available 'in print' books are listed in national Books-in-Print databases.
5) Each country maintains a special government publication database.
6) Lastly, online bookstore catalogues like that of Barnes & Noble, list a sizeable portion of current in-print books.

Tools to Find Webpages
1) Global search engines index hundreds of millions of webpages for free text searching. Consider Altavista and All-the-Web.
2) Global directories list resources by category. Consider Yahoo or the Open Directory Project.
3) Regional search engines and directories focus more tightly on regionally important topics.
4) Lastly, more specialized search tools, from search engines which focus on specific topics (like maths or government webpages), services which link you to important topic-specific websites, and services which manually review websites, all can take you further.
Tools to Find News
1) Current news is found in newspapers and the evening news. News clips can be delivered electronically, or purchased through specialist news clipping services.
2) Newswires redistribute regional news to a larger audience. Many newswires release their text news free online.
3) Specialized search engines like NewsBlip and TotalNews aggregate current online news.
4) State libraries archive past copies of regional papers.
5) Individual newspapers maintain libraries of previous articles. Many are available as commercial databases.
6) Larger commercial databases unite the news from many prominent newspapers. These databases of news articles stretch back many years.
This story is repeated with all the formats information comes in.
To drum this in with repetition, searching starts with a question. Select the format (book, news or webpage). Next, select one or more tools from our short list of search tools for that format. Want to understand the lifecycle of the spider? A book should prove useful. Let's look at either our local library book catalogue or a big commercial bookstore catalogue like Barnes & Noble (http://bn.com).
Search. Read. Voila, the lifecycle of the spider.
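The three steps (have a question, select a format, select a search tool) can be pictured as a simple lookup. The sketch below is only an illustration: the table condenses the tool lists above, and the names TOOLS_BY_FORMAT and suggest_tools are hypothetical, not part of any Spire Project tool.

```python
# Formats and example tools, condensed from the lists above (not exhaustive).
TOOLS_BY_FORMAT = {
    "book": [
        "Project Gutenberg",
        "Library catalogues (local, Library of Congress, British Library)",
        "National Books-in-Print databases",
        "Government publication databases",
        "Online bookstore catalogues such as Barnes & Noble",
    ],
    "webpage": [
        "Global search engines such as Altavista and All-the-Web",
        "Global directories such as Yahoo and the Open Directory Project",
        "Regional search engines and directories",
        "Specialized, topic-specific search tools",
    ],
    "news": [
        "Newspapers and news clipping services",
        "Newswires",
        "News search engines such as NewsBlip and TotalNews",
        "State library newspaper archives",
        "Individual newspaper archives",
        "Commercial news databases",
    ],
}

def suggest_tools(fmt):
    """Return the short list of search tools for one format."""
    key = fmt.strip().lower()
    if key not in TOOLS_BY_FORMAT:
        raise KeyError(f"No tool list for format: {fmt!r}")
    return TOOLS_BY_FORMAT[key]

# Step 1: have a question. Step 2: select a format. Step 3: select a tool.
question = "What is the lifecycle of the spider?"
chosen_format = "book"
print(question)
for tool in suggest_tools(chosen_format):
    print(" -", tool)
```

The point of keeping one list per format is the same as in the text: a book search and a web search draw on different tools, even when both tools happen to live on the internet.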
If searching appears a little boring at this point, you have not visited a library recently. The excitement comes in finding the information. The rest is dull indeed.
- - - - - - - - - - - - - -
The information revolution washes over us, picks us up and pushes us forward like so much driftwood. From now on our lives will forever be awash with information. We will eat it. Breathe it. Live in it. Drown in it. Some of us will even learn to live for it. Those most capable will have the skills to search, sift and sort information.
The information revolution is not about primary research, lab coats and discovery. It is about a surplus of information. The searching we have just discussed is not a particularly creative process. Simple searching is not sufficient to deal with the great tide of information moving against us. But then, simple searching lacks finesse. Simple searching is, well, simple.
Searching is one of those most delightful tasks where skill is everything. A search without talent will give you just a taste. Like pottery perhaps. Anyone can get something but only an expert can accomplish wonders. Quality information, reliable answers, effective coverage of resources; it takes skill to get to this level.
Advances in technology and the delivery of search assistance have made searching easier than ever before. Many search tasks can be accomplished without any experience. With more challenging questions a novice will get results - results they will be proud of. But not results they should be proud of. With experience, you will recognize how much more is possible.
Let's proceed by adding a little more complexity.
Searching is Complex Section 2
Your value as a searcher is directly related to the number of resources you can reach for quickly, and your skill at phrasing a research question. Consequently, as a searcher, you will work hard at building ready access to a range of resources. You also work hard at understanding the special characteristics of collections of information.
The technical name for complex searching is 'Information Research'. I prefer to think of information research as an effort to locate answers, efficiently. Information Research is not vague browsing of available information for something that interests you. It is not browsing the library bookshelf or reading the newspaper, nor is it internet surfing. Information research is searching with a purpose ... and it is hard work.
Research is also an art form. The skills, tools, and resources we work with are only the canvas and paints of an artist. Research extends from commercial, legal and reporting work through the skills of interviewing, database searching, and research analysis using books, articles, experts and patents. Research is so large a field, involving so many skills, tools and resources, you will quickly find you do not wish to learn it all.
At the heart of information research lies a simple motto: "Someone, somewhere, probably knows the answer."
To quote The Information Broker's Handbook (Sue Rugge and Alfred Glossbrenner): "As information brokers, we shouldn't consider ourselves capable of providing solutions... What we 'can' provide, and what sets a really good information broker apart from the rest, are resources. We can provide the client with the kinds of information he or she needs ... that make it possible for individuals to solve their problems."
Let this sink in. We are not experts in the field we are researching. Collecting information on the moons of Jupiter? Do not pretend to be an astronomer. We are only experts at the tools for gathering information.
A Quick Introduction to Effective Searching.
1) Searchers work hard to properly frame the question.
2) Searchers know the technology, know where to look.
3) Searchers know you can ask.

Step One: Properly Frame the Question
The preparation of your question is critical. There is a galaxy of difference between a young student asking, "I am interested in trees", and a specific, attainable question like "Where would I find a tree surgeon I can talk to?"
The information sphere is very large and rather confusing. Each item of information has aspects of authenticity, accuracy, reliability, and bias. Information comes in many formats: interviews, books, articles, statistics. We learn about information from many sources: literature, discussion, resource lists, experience. There are also personal issues: budget, time, depth and purpose.
With all this to think about, we must be very careful about each question we ask. This issue is vital once we start an article search, and can easily mean the difference between 5 concise articles, and hundreds of general articles. The essence of our question is the manner with which we approach the information sphere. The question directs our efforts.
One key is to treat searching as an art, much like painting or photography. The true mark of an artist, and the primary step wanna-be artists miss, is visualizing what you want before you begin.
When searching, sit down and visualize what a successful search would look like in this situation. How many pages? How many documents? What kind of authors and what kind of quality of document? Go through the whole gamut of different types of research tools and describe it. Would a simple three-line newspaper article be a success? Would a 20-year-old dissertation be acceptable? Would a short conversation with an expert suffice? Would all three together suffice? (This approach works exceptionally well with internet research too.)
If you can phrase a question in a way that lends itself to your resources, you are far more likely to get the answers desired. Oddly, this often means you are asking for places where the information resides rather than asking directly for the information.
A novice starts with a question like, "What can I do for my exceptional child?" You should rephrase this question immediately. "What resources will help me help my exceptional child?" These are both valid questions but the second question has a distinct answer - the first is far too vague. Other questions could be "What are other parents doing for their exceptional child?" or "Who can help advise me on how to teach my exceptional child?"
Now we shape the question to get precise answers. "Where do I find a definitive list of associations?" (or a search for "+association +directory") works much better than, "What association works with exceptional children?" What about, "Who would know of associations for exceptional children?" and, "Are there pamphlets of advice for parents of exceptional children?" and, "What umbrella organizations/specialist libraries exist for exceptional children?"
Questions are not right or wrong, just better or worse at illuminating certain aspects of the answer. Make sure your questions illuminate something useful.
There are ways to frame questions for commercial databases, for research assistance, for interviews, for getting the truth from your children. Your skill in phrasing the question has a lot to do with the results. Poor questions tend to come back and haunt us later when we miss relevant information. Set aside ample time to refresh and reframe your questions.

Step Two: Know the Technology, Know Where to Look.
Research rests on understanding the technology and an awareness of the resources. In the example above, a directory of associations does exist. Here in Australia it is the "Directory of Australian Associations", found in most important Australian libraries. The Australian "Department of Education" has a major interest in promoting exceptional children. In Western Australia, Infolink, a community information service, should have a record of major community groups for exceptional students. I have no direct knowledge of umbrella organizations or specialist libraries, though I expect both the education department and Infolink would. A quick search of some large libraries may help us find some of the pamphlets.
Knowing of specific resources is helpful. It is great if you live next door to the president of Mensa. You have easy access to someone knowledgeable, able to give his or her take on the situation.
Knowing the tools to help you find resources, the meta-resources, is vital. So what if we do not know exceptional students come under the Department of Education? Do we know who to ask to find the government department involved? If you do not know of the directory of associations, whom would you ask, or where would you look, for one? Being unfamiliar with meta-resources is a serious handicap - you will find yourself searching hours for something a professional would do on the phone while drinking coffee.
Keep in mind the Spire Project is dedicated to providing you some of this experience. Our web articles should suggest directions to look. But there are limits to how we can help. At some point you simply must sit down with the Kompass Directory, or the Gale Directory of Databases, or the Australian Bureau of Statistics library, and become familiar with getting to all the relevant information.
Another must, for all searching, is experience searching electronic databases with complex research queries - a difficult task only made better with practice. As a general rule, if you don't use Fields, Proximity and Boolean search terms, you are doing it wrong. Most people do it wrong.
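To make the three terms concrete, here is a minimal sketch of Boolean, proximity and field searching over a tiny in-memory collection. Everything in it is an assumption for illustration: the three records are invented, the function names are hypothetical, and real commercial databases each have their own query syntax and usually add features such as truncation that this sketch lacks (it matches whole words only, so "association" would not match "associations").

```python
import re

# Invented sample records; each has two searchable fields.
RECORDS = [
    {"title": "Directory of Australian Associations",
     "body": "A directory listing associations and community groups across Australia."},
    {"title": "Teaching exceptional children",
     "body": "Advice for parents of exceptional children, with a list of support associations."},
    {"title": "Tree surgery basics",
     "body": "How to find a tree surgeon and what questions to ask."},
]

def words(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def has_all(text, terms):
    """Boolean AND: every term must appear."""
    found = set(words(text))
    return all(term in found for term in terms)

def has_none(text, terms):
    """Boolean NOT: no term may appear."""
    found = set(words(text))
    return not any(term in found for term in terms)

def near(text, first, second, max_gap):
    """Proximity: the two terms occur within max_gap words of each other."""
    tokens = words(text)
    first_at = [i for i, w in enumerate(tokens) if w == first]
    second_at = [i for i, w in enumerate(tokens) if w == second]
    return any(abs(i - j) <= max_gap for i in first_at for j in second_at)

def search(records, predicate, field=None):
    """Field searching: test one named field, or all fields joined together."""
    hits = []
    for record in records:
        text = record[field] if field else " ".join(record.values())
        if predicate(text):
            hits.append(record["title"])
    return hits

# Field + Boolean AND: both words must appear in the title.
title_hits = search(RECORDS, lambda t: has_all(t, ["directory", "associations"]), field="title")

# Boolean AND NOT across all fields: associations, but not directories.
not_hits = search(RECORDS, lambda t: has_all(t, ["associations"]) and has_none(t, ["directory"]))

# Proximity within the body field: the two words must be adjacent.
near_hits = search(RECORDS, lambda t: near(t, "exceptional", "children", 1), field="body")

print(title_hits)
print(not_hits)
print(near_hits)
```

The same question asked three ways returns three different slices of the collection, which is the practical reason for learning all three techniques rather than typing loose keywords.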
Step Three: Know You Can Ask.
There is very little mystery about professional research. Lots of people are experienced in different aspects of this field. My personal weak point is in direct interviewing, whereas I am a pioneer in secondary resource research. This is OK. In fact I use this liberally to determine the skill of professional researchers - do they know their own limits? The field is much too large to be an expert in all its aspects.
The positive side to this is many people welcome requests for help. I enjoy asking librarians questions. I also ask my customers, my suppliers and other professional researchers. Never get caught in the trap of feeling you know what to do. The joy in this profession is that most people do not expect you to be an expert in their field, just an expert in your field: particularly the meta-resources. Even if it requires a polite reminder, customers will appreciate you asking them for likely keywords in difficult searches. I always make a habit of asking librarians if I am missing something. A librarian is always fluent in their collections and I frequently locate real gems this way. (As an example, my state library arranges computer books in two sets, one Dewey and another in an alternative structure. Who would have guessed?)
Especially if you are just a student, always keep your ears open. You will frequently find yourself in the presence of some expert in some facet of research telling you something you already know. Consider carefully before you interject... Your expert may be about to explain something new to you.
Information research is a dedication to learning. At its heart is a collection of specific research skills, an awareness of research tools, and a gifted mind. - Oh, and a large amount of coffee. Without knowledge of and access to relevant research-worthy resources, your research will be severely limited and doubtful. This is why much of your work becoming an effective researcher involves learning about the resources and meta-resources for your field. Much of our work in the Spire Project is drawing your attention to relevant resources.
Before we progress to specific resources for specific formats (books, webpages, news), let us attack head on the role of the internet in information research. This should surprise you.
The Internet Format. Section 3
As Shakh became more proficient with writing, father wrote more frequently of the family deity. Horus, the falcon god, had long watched over his family. Horus sees all, his father would write, and even across the many miles separating you from us, Horus will watch over you and keep you close. It was a great comfort to Shakh to have the family deity looking after him.
Shakh too devoted himself to a life of watching and knowing.
- - - - - - - - - - - - - -
We have discussed how information comes packaged in certain standardized formats like books, articles or news clips. Each format has particular qualities and standards that reflect the way the information is prepared. For example, books are dense, factual, comprehensive and a minimum of 6 months to a year old.
So how can we apply this newfound wisdom to the internet?
Let's start at the beginning. The internet is an inexpensive and pervasive system for the delivery of data. It is also the medium of a dramatic shift in the way we access information.
A (1) dramatic drop in the cost of publishing is fuelling (2) the liberation of information from previously closed systems, leading to (3) an emergence of alternative funding for certain public resources and (4) an eagerly awaited 'direct to consumer' commercial information industry.
The first mental knot to untie is the separation of internet resources into distinct formats. Electronic books share most of the qualities of books published on paper. News stories found on the web share all of the qualities of news in your local newspaper. The fact they are electronic or appear as webpages has nothing to do with it. News is news. Electronic books are almost books.
But if online news is news, and online books are almost books, and both are not internet formats, what is an internet format?
The search-by-format method is a concept to simplify and understand the many information resources which exist in the world. The concept is only as valuable as it is successful at enlightening us. As to the internet, we have more to learn, but could safely divide the internet into several formats at this time, perhaps webpages, online discussion and ftp resources. Yet this is largely superficial. The real value comes from understanding the qualities of different types of webpages. We shall divide the webpage format further.
Must we really learn this? You would be pardoned for equating searching and the internet. Much of the hype surrounding internet search tools builds the illusion that the skill of searching can somehow be distilled computationally then delivered to you electronically. Through the wonders of modern science, you can have the best information at your fingertips without having to learn anything of search technology.
This is a pervasive lie (or marketing fiction). The electronic research industry has been around for decades and has worked on this problem for some time. No upstart internet guru has invented a technique to suddenly transform the search process. Such thinking would work in section one (Searching is Simple) but is the first illusion we must shatter for you to progress.
Case in point: the Lycos and All-the-Web search engines use the same database of webpages. This database is growing rapidly. It stood at 350,000,000 webpages in June 2000 and hopes to reach one billion webpages by the end of 2001. It stands as a grand achievement in organization, right?
Wrong. Years ago I was using a unified database of news called Global Textline (no longer available but replaced by others). It had an astounding four billion news articles available for advanced text searching! Four billion news items, representing many years of news from all over the world. This was superficially 10 times the size of the current All-the-Web search engine.
No, the internet does not even hold the record for being the largest information field. Oh, it will surely surpass the quantity of commercial information, and superficially we could say it may already have achieved this. But the internet is not a new medium for information research. It is emerging as a new resource, not a new phenomenon.
The internet is a new medium for business - most businesses have never incorporated the immediacy or global nature of internet involvement, so considerable rethinking is required. The internet is a new medium for publishing for almost all of us; very few of us published electronically before the internet emerged. The internet is NOT a new medium for research. Information researchers have been working electronically for years. The internet is just a new resource we can reach for with strengths, weaknesses and peculiar traits we must appreciate.
By way of an example, let us compare Link Analysis as used in Google and Raging (of Altavista) with the process of editorial vetting as used in scientific journals.
Through the magic of link analysis, we can make certain assumptions about the value of a webpage by adding up the number of other pages linking to that page. In its simplest form, webpages with at least 100 inbound links from other websites are judged to be quality, valuable resources. A webpage without any inbound links has the suspicion of being of poorer quality. After all, no one has thought it valuable enough to add a link to their further resources page.
This logic has some serious shortcomings. Firstly, the process rewards long-term projects that have been online long enough to earn links. A brilliant new webpage would have few links - yet. It would be ranked poorly, undeservedly. Secondly, link analysis rewards websites over webpages. The pages with the most links are often homepages. Rating homepages over second level webpages works at odds to keyword searching. Our keywords will be found in specific, perhaps second-tier webpages. Links go to the top level. Thirdly, link analysis is a mass market, popular technique. You are banking on the intellectual finesse of a mass of mindless computer users much like yourself. It is the same kind of popular democratic selection that votes B-grade actors into the presidency.
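The simplest form of link analysis described above can be sketched in a few lines. This is only an illustration of the counting idea, not how Google or any other engine actually ranks pages. The function names, the example.* addresses and the choice to count distinct linking websites (rather than individual pages) are assumptions made for the sketch.

```python
from collections import defaultdict
from urllib.parse import urlparse

# The "at least 100 inbound links" rule of thumb mentioned in the text.
INBOUND_THRESHOLD = 100


def count_inbound_links(links):
    """Count, for each target page, how many distinct other websites link to it.

    `links` is an iterable of (source_url, target_url) pairs.
    Links coming from the target's own website are ignored (an assumption).
    """
    linking_sites = defaultdict(set)
    for source, target in links:
        source_site = urlparse(source).netloc.lower()
        target_site = urlparse(target).netloc.lower()
        if source_site and source_site != target_site:
            linking_sites[target].add(source_site)
    return {page: len(sites) for page, sites in linking_sites.items()}


def judged_valuable(inbound_counts, page, threshold=INBOUND_THRESHOLD):
    """Apply the naive rule: enough inbound links means 'quality'."""
    return inbound_counts.get(page, 0) >= threshold


if __name__ == "__main__":
    # Hypothetical link data, for illustration only.
    sample = [
        ("http://a.example/x", "http://b.example/"),
        ("http://c.example/", "http://b.example/"),
        ("http://b.example/about", "http://b.example/"),  # own site, ignored
    ]
    counts = count_inbound_links(sample)
    print(counts)
    print(judged_valuable(counts, "http://new.example/brilliant.htm"))
```

Notice that a brilliant new page with no inbound links scores zero here however good it is, which is exactly the first shortcoming listed above.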
Let's contrast this with the process of editorial vetting used in scientific journals. Each article is reviewed by a selection of knowledgeable peers who understand the topic in great depth. Each article is further improved by the editing of the journal editors, and by self-editing, for there is great competition and prestige at stake. Only a handful of the many submissions are judged worthy and appear in the printed journal. Success places the successful in the standard of record; stamped with an external statement of truth and importance.
Of course, the logic of editorial vetting also has shortcomings. Firstly, the process is time and effort intensive. Many of the most important journals will delay six months or more between submission and publication. In our digital era this is increasingly unacceptable. Secondly, the number of submissions accepted is at odds with the pace of development. So much more happens in the world than can be digested in this manner. Thirdly, editorial vetting supports the clannish behavior leveled against the upper echelons of science. New and novel developments have difficulty floating to the top if the peer review process should not be open to new ideas.
If link analysis is popular and democratic, editorial vetting is elitist and autocratic. Both approaches have pros and cons.
Once you have absorbed the drama between link analysis and editorial vetting, please do not retain the belief that your search needs will be completely solved for you. Searching is a complex, overgrown garden and it's time to get your hands dirty.
So what does the internet have to do with searching? The internet changes searching in two ways. Firstly, the webpage is a new format to contend with.
"Webpages are often of unknown age, of only guessed at quality and potentially the easiest information to retrieve. There are many points of entry to web resources but search tools differ. Try to match your search tool to your question." (See http://spireproject.com/webpage.htm)
The internet is also a conduit to many of the pre-existing tools for searching other formats (books, news, interviews).
With an internet connection, we can reach database retailers and many commercial quality databases like LOCOC, ERIC, MOCAT and AGIP directly from the source. We can also remotely search the catalogue of most libraries in the world. These are not new resources, just new ways to reach them.
In this day of interconnectivity and change, it is too tempting to declare the information industry is in rapid flux. Everything I have learned suggests this is not so. There are some changes associated with new channels but by and large the process of searching for information remains the same.
Let's look briefly at news as an example. News articles are written by the reporter, sold to international newswires which then distribute these stories to interested newspapers and news channels, that incorporate the news into your newspaper or evening TV news.
Journalist - Newswire - Newspaper/News show - You.
News would also be added to commercial databases of past news. These databases are then provided to database retailers like Dialog or Lexis-Nexis who sell occasional access to you.
Journalist - Newswire - Commercial Database - Database Retailer - You.
With the internet, newswires have also provided their text news to online sites. Text news is thus available for you to browse or search.
Journalist - Newswire - Internet News Sites - You.
I draw your attention to several facts. The fundamental nature of the industry has not changed. Journalists and newswires still impart upon the news the same nature as before. It is short, shallow, immediate. It is created to journalistic standards.
If you wish to search past news, you must still reach for the commercial database, most likely through a database retailer. Searching for news online only goes back two weeks at most.
Lastly, to date only the text format for news is widely disseminated. Sometimes a couple of pictures are included but the visual news, as used in the evening news on TV, is sure to remain priced beyond public consumption.
So what has changed? There is another venue for you to pick up the news. There are opportunities for new databases to be created, some of limited time (like totalnews.com - a database of current news on other websites). Little else has changed. The creation and dissemination of news remains pretty much as before the internet arrived.
Let us look even more briefly at book publishing. Books are produced by authors, improved by editors, published by publishers, marketed by bookstores, then purchased by you.
Author - Editors - Publishers - Bookstores - You.
Today we have a couple of new online bookstores - and a large number of new old online bookstores (existing bookstores now selling online). We have a collection of free books online (largely classics like Shakespeare, which strangely, were immediately published as really inexpensive paperback classics available in airports everywhere).
There is also a range of very useful commercial quality book databases which have become free to search online. I am thinking of the government publication catalogues (MOCAT [US], AGIP [Australia] and Stationery Office Online Catalogue [UK]) and the online catalogues for the Library of Congress (LOCOC) and the British Library.
Lastly, the online catalogue to the large bookstores like Barnes and Noble, Amazon and The Internet Bookshop (UK's WHSmith) can provide a free and fast database of books in print, though not as good as the commercial Books-in-Print databases. Of course, any local bookstore will offer to search books-in-print for you, so this is not as revolutionary as it might at first appear.
In summary, we have a collection of recently discounted book databases we can more easily search, we have additional sites to buy books, and little else. The creation and dissemination of books remains pretty much as before the internet arrived. Has the book industry changed? Not really.
The most remarkable change has been the emergence of group discussion online, the emergence of a new format for information (like the webpage) and the opportunities to connect faster to a whole range of pre-existing searchable resources.
This is the reason why we discuss searching-by-format. Later, at the end of this FAQ, we return to this topic and show that the real revolution is not in resources or industry or search tools but a revolution in immediate access. Access, it turns out, enriches the art of searching.
Pessimistically, and as a counterpoint, the internet as an information resource can still be much too limited for many situations. If we are not careful, searching the internet becomes no better than browsing the shelf of your state library.
What most impresses me about the internet is the promise of changes in the future. The internet as a system suggests radical improvements to the current decade-old systems that have attained their search-worthy status. What impresses me most are the improvements mostly still in the future, not yet proven, set to remain promising ventures for a time.
This is not to say internet research can not be rewarding. In some fields like computer studies, the internet has already surpassed parity with books, articles and associations. Just when you will consult the internet as a research-worthy resource depends on cost, effort, and the quality of the information returned. This judgement call requires more than a little experience.
Value is important. I sincerely hope we can suppress our enthusiasm for free information in favour of a truer appraisal of the value of information. Make no mistake, commercial information is brilliant. It is almost heresy to even compare commercial information with the results of a few hours on the internet.
Internet Information Theory
Let us agree the internet is great fun to surf but more challenging when you have a specific question in mind.
To improve our search skills, we begin by understanding how information is arranged on the internet. Contrary to myth, information is not disorganized but rather organized very carefully along clear patterns. Many patterns are specific to the information format (text document, webpage, email message, printed article). Further patterns match the way we become aware of information, or are specific to the information systems (mailing list, FAQ, peer-reviewed journal). Your understanding of the strengths and weaknesses of each pattern, each format, each system, guides your search for information. We shall start by shattering the internet, and commenting on the many pieces.
Three Definitions of the Internet
Do be careful when using the word 'internet'.
1_ The internet is a physical network; more than a million computers continuously exchanging information. The internet allows us to transfer information around the world.
2_ The internet is a landscape of information available on almost every topic imaginable. This information appears almost chaotically distributed to the world but holds clear patterns. For instance, linking information together are various structures like government web links, search engines and FAQ documents.
3_ The internet is a community of 500+ million individuals. These are real people who choose to interact, discuss and share information online.
In this example, let me just draw your attention to the way most of our research effort focuses on the second definition: a landscape of information. Much of the best information originates in the third definition: the internet is a community. Sometimes it is far more effective to ask real people than search the information cyberspace.
What I just mentioned is not so important as the technique I just used. I broke the large seemingly chaotic system into smaller pieces: pieces that hopefully make more sense. Eventually, when we've made sense of the little bits, perhaps we can comment astutely on the big-picture.
Information, transaction, entertainment
There is a triad of functions to all online activity:

Function      - Activity - Unit
----------------------------------------
Information   - Research - The Fact or Conclusion
Exchange      - Business - The Transaction
Entertainment - Play     - The Experience
Each internet function grows at a different rate and moves in a different direction. The development of forums is firmly in the smallest segment dealing with information. This segment is quite poorly organized and confusing. The entertainment function in contrast is well financed and graphically innovative with clear, profitable opportunities.
Much of the web is prepared with Exchange or Entertainment in mind. "Brochureware" (purely promotional webpages) is rarely required for research but is critical to securing a transaction. Entertainment related or just entertaining websites abound. Let us recognize just how few webpages are information & research related.
My own experience suggests we are just beginning to see the movements towards profiting from providing information. Direct selling of information is still chaotic and unrewarding.
Information Formats
The way information is packaged has a great bearing on the content, quality and use of the information. This theme is evident throughout the work of the Spire Project, and is particularly applicable to internet information. Webpages, text files, software, email and database entries each have particular qualities. Each shapes, constrains and restricts the informative content. These particular qualities apply irrespective of the information involved.
Books are dense, factual, a little old. Articles are short, sharp, more recent. News is puff, introductory, immediate. Each way the information is packaged, each format, presents the information to set standards.
Information formats on the internet are the same. Webpages are graphical, technical to produce, and not easily updated. FAQs are easier to maintain, text only, and attract more peer review. Mailing lists are simpler still: text, short, immediate, very peer-reviewed, characterized by discussion and resource discovery. Newsgroups are characterized by extremely low costs, vulnerable to trashing, poorly managed. Email is simple to use, one-to-one discussion.
Let's look at books more closely. Books are created by authors who have something to write. Books are printed and marketed by Publishers to the bookstores that then provide it to the readers. Each facet of this process defines the resource. Books have quality, editorial vetting but minimal peer-review, marketable value and a potentially lengthy preparation time.
When it comes to research, why look for a book when investigating digital money? Books would just have the wrong qualities - would present the information poorly. We need a more current format (digital money is a fast moving topic), and a more peer-reviewed format (books have editorial vetting but not intrinsic peer-review). Why not search for a mailing list, an FAQ, or an association website. These formats have qualities more appropriate to our question.
Information Preparation
Information flows also impress patterns on internet information. Most information is transplanted to the web - first created elsewhere. The source of information imparts as much pattern as the eventual format the information takes.
Information may appear as a webpage, and conform to our expectations for all webpages but the information may have been prepared from the discussion on a mailing list - and thus enjoy a more topical, specific, timely and peer-reviewed quality.
Let's look at FAQs. The best resource in the world on copyright law is the musings of a group of copyright lawyers who form the copyright mailing list. The copyright FAQ supported by this group is a logical document summarizing much of the discussion of this mailing list. FAQs are vetted by the news.answers team, then automatically mirrored around the world. From its origins in the mailing list, the FAQ is a peer-reviewed document, often full of links to further resources, topical, knowledgeable and factual. As an FAQ, the document is not immediate, graphical or financially rewarding (some FAQs stagnate).
Only some internet information is created within the internet environment. The concept of 'brochureware' describes the common traits to promotional webpages directly prepared from paper promotional brochures.
One of the more exciting trends is the movement of information from the dusty shelves of government offices and association libraries to their more accessible websites. The quality of information retained in your average government agency, from quality research reports, to detailed studies, to current industry monitoring is very high. These qualities are then brought over to the web format. Such web-documents tend to be isolated (not linked to other related resources) and perhaps a little behind the time line but of a generally high quality.
An exciting holistic view of the internet information landscape is based on these descriptions. Imagine, for a moment, information flowing through a collection of systems. At certain points, information groups together, and generates new, perhaps higher quality information, which then flows in a different system, a different direction, to different people.
The flow of information from one person to another, from one format to another, imprints qualities to the information along the way. Each organization, or subsequent re-organization, imparts specific styles and conventions and quality to the result.
Publishing Motivation
Let us proceed to a third set of patterns. Information appears on the internet for one very specific reason. Someone Publishes (DUH). The motivation behind publishing colours the information. This is a pattern we can use to quickly judge the contents of a webpage.
Ask yourself who is publishing, and why.
One of the biggest publishing segments a year ago was individuals publishing documents derived from their personal expertise. A typical document would be one with minimal peer review, a list of aging links to further resources, simple graphics, variable to short length, prone to bias but moderately reliable because the publisher knows their topic well. These pages are often located on web servers in private sub-directories (usually starting /~name/).
Commercial sites publish mainly for the promotional value. Their secondary purpose is to provide sales information to prospective clients. Rarely do commercial sites go beyond this. Commercial webpages often reside on their own domain name, as a .com, or in sub-directories - without the tilde symbol. Commercial sites also tend to age badly. They are very noticeable from their front page.
Government agencies are emerging as valued publishers. Slowly their dormant information becomes available through this new medium. Currently almost all government documents on the internet also appear in print, meaning they are factual, exhaustively reviewed, tend to be a little old (but age well), and come from highly paid knowledgeable people who believe it is their duty to inform others. Such documents are lengthy and appear on .gov domains.
These patterns are simple to see.
Grant-funded projects create brilliant research resources and hold much promise in pushing the limits of this technology. I am eager to see the results of the US Patents project, and appreciate the value of having Supreme Court rulings on the internet. Often such projects focus deeply on content. Most projects reside on educational servers and are widely discussed within knowledgeable groups.
Associations publish association-kind-of-things. Most are initially just like the commercial webpages. With time such sites become much more factual and research-worthy. Most associations are dedicated to developing awareness of their chosen topic, albeit coloured by their chosen bias. Few associations are significant publishers but in time, this segment will begin to liberate dormant information within associations.
Let's summarize. The key is to always watch who is the publisher. We can assume a great deal, quickly. We are unlikely to find the latest changes to patent law from government or commercial publishers. Such organizations are simply not motivated to present such information.
Promoting Information
Publishing is one achievement but you and I will never read any information until we learn it exists. This simple fact creates even more patterns to internet information. Knowledge of information moves through set routes on its way from writer to reader.
Promotion is not simple. It is a process that takes time, effort and perhaps money. Information without serious promotion tends not to be promoted far from the source. Another way to phrase this; you must search close to the source to find poorly promoted information.
A search engine indexes pages relatively indiscriminately. This also means a site of quality is not likely to reach your attention. The odds are not good, and from a promotion point of view, search engines generate minimal traffic to your webpage. Search engines also drop you rather randomly into a website. It is often necessary to move up a directory to understand the purpose and motivation of a site you find interesting.
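Moving up a directory is mechanical enough to sketch. The snippet below is a minimal illustration, assuming an ordinary http address; the function name and the example.edu address are hypothetical. It lists the addresses above a page, nearest first, so each can be visited in turn to learn what the surrounding site is about.

```python
from urllib.parse import urlsplit, urlunsplit


def parent_urls(url):
    """Return the addresses above `url`, nearest directory first, ending at the site root.

    Any query string or fragment on the original address is discarded.
    """
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    parents = []
    while segments:
        segments.pop()  # step up one level
        path = "/" + "/".join(segments)
        if segments:
            path += "/"  # directories end with a slash
        parents.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return parents


if __name__ == "__main__":
    # Hypothetical address a search engine might drop you into.
    for address in parent_urls("http://www.example.edu/~jsmith/papers/copyright.htm"):
        print(address)
```

Not every directory will have an index page, so some of these addresses may return an error; the site root almost always works.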
Information published through advertising tends to have a financial payoff for the promoter. This kind of information tends to be promotional information. Brochureware.
The alternatives are to promote a webpage or website through one of the referral tools. Each such tool accepts links on some criterion. Each tool you use to locate information also selects particular types of information for your attention.
If you arrive at a document by recommendation through a mailing list, the document is likely to be recent, on-topic and specific to the purpose of the mailing list. Alternatively, (for poor mailing lists) it will be wildly off topic and trash. You are unlikely to see referrals to old documents or documents of historical importance. These are the qualities most acceptable to the mailing list environment.
Directory trees, FAQs, guidebooks and related promotion tools all work as historically important documents. In the past, such resources list, describe and alert people to relevant information for the field. Slowly, over time, this function becomes acknowledged, reinforced and promoted. Time is the essence of this fame.
Webpages or websites found through historically important documents, by their nature, tend to be long lasting websites with lasting importance in the field. Such documents point to other similar documents or websites that have achieved a long-lasting importance. You are unlikely to find specific documents but rather sites that focus or bring together information. In short, there is little motivation to link to specific webpages, when a link to an important website is just as good.
Similar generalizations can be made of each type of promotional tool, and become important in rapidly seeking out information which matches our intention, as well as summarizing the likely motivation, and bias, of webpages we are interested in.
Information Clumps
Information is created, nurtured, develops, gets transplanted, gets arranged and then becomes visible through a process which brings similar information together.
As we have discussed, there are factors deeply affecting all information on the internet. Motivation, Preparation, Format and Promotion all define the quality and content of any given item of information. With so many influences, we should not be surprised to learn information naturally groups together. In reality, there is nothing natural involved - it is a social phenomenon reinforced each time you and I visit or read one resource but not another.
History can explain some aspects of internet development. As a small collection of sites become dominant in particular fields, by collecting and delivering better content to more people, new sites find it progressively more difficult to capture attention. This dynamic works for websites reaching out for visitors, and discussion groups reaching out for subscribers. In each case, seniority counts.
Seniority counts in several ways too. Promotion is directly related to quality, interest, traffic and time. The longer a site is active, the better the footpath develops, the more people visit. Secondly, quality content is directly related to access to quality content, peer review, and time/money. Important existing sites gain in every way.
This results in a grand system where the first-in, best-dressed, can capture the high ground and secure a grand lead in awareness and footpath over competitors who follow. Yahoo is a prime example of a directory tree, not even the best in most areas, which has achieved unparalleled traffic & awareness.
This competition is equally evident where no money is involved. Perhaps your association wishes to create a new referral website, or an open mailing list, or an informative guide. All sound concepts, effective projects. However, if older, established resources exist, the work will be long and arduous.
Despite the marketing message, the internet is not a world where the best information floats to the top. The internet will not simply let you reach millions. You must compete for attention, participation, devotion and assistance in a manner very similar to building a business.
In concrete terms, information clumps on the internet. The best resource could appear on any internet system (webpages, email mailing lists, ftp-archives, FAQs, online databases, newsgroups...) but we can be fairly certain the best information will congregate in just one or two. Consider this as an application of the 80:20 rule. 80% of the good information will be found on 20% of the formats, arranged concisely by 20% of the search tools.
Consider our article "Searching the Web" (http://spireproject.com/webpage.htm). We progressively search different web tools, looking for the most worthy. Searching the internet is the same. You must touch each system to see which system is dominant, where the information is congregating for your topic.
Bringing this together
In summary, we have broken down and discussed various qualities of published information and promoted information. We have made sweeping generalizations and educated guesses about information on the internet. Now what?
When a painter begins to paint, they have already visualized some of the image. They already have a concept of the finished result. Internet research is no different. We start by building a vision of the information we seek. Who would publish it? Where would I find it? What is its motivation? How would we find it? We now have a practical vision.
The address is one of the keys. The web address (or URL - Uniform Resource Locator) for any item of information gives us a surprising amount of information - particularly as we are making generalizations about information patterns. We can guess if information resides on a personal webpage, a funded university project, or a commercial project. If the information resides on a .gov website, the quality is likely to be higher and conform to our expectations of government resources.
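These snap judgements from the address can be written down as a handful of rules. The sketch below is only that, a sketch: the function name and category labels are hypothetical, and treating .edu as the mark of an educational server is an assumption (the text above says only that such projects reside on educational servers). The tilde, .gov and .com rules follow the generalizations made earlier in this section.

```python
from urllib.parse import urlsplit


def guess_publisher(url):
    """Make a rough guess at the kind of publisher behind a web address."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # A private sub-directory (/~name/) suggests an individual, whatever the host.
    if parts.path.startswith("/~"):
        return "personal"
    if host.endswith(".gov"):
        return "government"
    if host.endswith(".edu"):  # assumption: .edu marks an educational server
        return "educational"
    if host.endswith(".com"):
        return "commercial"
    return "unknown"


if __name__ == "__main__":
    # Hypothetical addresses, for illustration only.
    for address in (
        "http://www.example.edu/~jsmith/links.htm",
        "http://www.example.gov/reports/study.htm",
        "http://www.example.com/products/",
    ):
        print(address, "->", guess_publisher(address))
```

A guess like this is a starting point for judging a page, not a verdict. Many addresses (country domains, associations) fall outside these few rules and come back as unknown.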
We use this new-found experience in three ways. Firstly, we restrict our searches to the most likely sources. Secondly, we quickly jump through lists of resources (such as those generated by search engines) to the sources that match our expectations. Thirdly, our assessment of information quality can be guided by our snap-judgements of its origin and purpose.
Internet newcomers often expect to have instant access to the latest information at the touch of the button in beautiful colour and peer reviewed quality prose. Who is publishing this? Where is this information coming from? Who would help us find this? Such a vision is fantasy. If we were instead to look for an association website, dedicated to a certain type of research, or an informed newsgroup, maintained by people passionate about sharing this technology, then we have made four steps forward. We are clear about where to look for the answers we seek, and we will know quickly if the answers are online.
Let us now leave this discussion on internet organization and internet theory. This is tough newly discovered territory, more than a little rough. I fear it will make most sense to people with considerable experience with the internet. Let us now explore the fertile grounds of understanding more familiar formats like books and news.
___________________________________________________
This document continues as Part 2/6
___________________________________________________
Copyright (c) 1998-2001 by David Novak, all rights reserved. This FAQ may be posted to any USENET newsgroup, on-line service, website, or BBS as long as it is posted unaltered in its entirety including this copyright statement. This FAQ may not be included in commercial collections or compilations without express permission from the author. Please send permission requests to david@spireproject.com
From: David Novak
Subject: Information Research FAQ v.4.7 (Part 6/6)
Date: 29 Dec 2004 05:28:15 GMT

Archive-name: internet/info-research-faq/part6
Posting-Frequency: monthly
Last-modified: April 2002
URL: http://spireproject.com
Copyright: (c) 2001 David Novak
Maintainer: David Novak
Information Research FAQ (Part 6/6)
The FAQ is written like a book, with a narrative and pictures. You have found your way to part six, so do backtrack to the beginning. If you are lost, this FAQ always resides as text at http://spireproject.com/faq.txt and http://spireproject.co.uk/faq.txt and with pictures at http://spireproject.com/faq.htm
Searching as Industry. Section 9
Of interest to you now, the internet offers you a very good look at the information industry. Most organizations involved in the information industry publish exhaustive product descriptions on the net. Most commercial products are delivered electronically.
Professional Search Resources
As a profession, researchers have diverse skills and needs. Constantly working with information, in a competitive market, professional information seekers are often starved for high quality information about new research techniques, skills and sources. This can be found through discussion groups like BusLib-l, websites on library science like LisNews.com, associations like the Association of Independent Information Professionals (AIIP) and the Society of Competitive Intelligence Professionals (SCIP), and events and conferences as listed in the journal Online & CDROM Review.
As more introductory resources, start with a selection of books and webpages like:
- The Intelligence Cycle[1], courtesy of the CIA library - a single-page summary of the research process.
- The Information Broker's Handbook by Sue Rugge and Alfred Glossbrenner, McGraw-Hill. Third Edition (1997) - a must-read for those interested in the business side of information research.
- Secrets of the Super Searchers by Reva Basch. Unfortunately a 1993 book, but unique as a look into the field of information brokers. Published by Eight Bit Books. (Dewey 025.524 BAS)
- Online is a good bimonthly magazine for information brokers. (Dewey 025.04).
There are a number of interesting periodicals, most owned and marketed by Information Today Inc. BUBL lists a number more [2]. Others are electronic publications, like LIBRES [3]: Library and Information Science Research Electronic Journal, a biannual scholarly journal and Information Research [4].
The commercial databases of interest are LISA (Library and Information Science Abstracts), ALISA (Australian LISA), Information Science and Library Literature.
The links for these resources and more are on the Spire Project at http://spireproject.com/links.htm#3
[1] http://www.odci.gov/cia/publications/facttell/intcycle.htm
[2] http://bubl.ac.uk/journals/lis
[3] http://aztec.lib.utk.edu/libres/
[4] http://www.shef.ac.uk/~is/publications/infres/ircont.html
- - - - - - - - - - - - - -
The Professional Search
Professional research demands a more effective, timely use of resources at hand. It is challenging, and it is an occupation.
Unlike research undertaken for your own needs, professional researchers often know little about the topic they are asked to investigate. We may not know the phrases which accurately describe a specific concept, we sometimes don't recognize gold if it's labeled copper, but we have to do everything fast - lest the cost escalate above the expectation of the client.
Client? Yes, professional research starts with the client.
Professional research involves far less book and library work, and far more interviewing, database access and online article purchasing. When money is involved, time becomes very precious. The first luxury lost: the luxury to get to know the topic in leisurely detail.
Instead, professional research starts with a careful description of exactly what information is desired (and why). You must quickly build a good plan about who you will ask and where you will look. This is, after all, the primary skill that others have great difficulty duplicating - traversing the information sphere swiftly and skillfully.
Many researchers today can search databases. Most researchers are familiar with library work. Personal research has the added benefit of being part of the learning process. So why reach for a professional?
The first unique skill we must refine is our knowledge of the research tools. Computer databases may be easily accessible, but are not easy to search. Interviewing is conceptually simple, but is not simple in practice. Each aspect of research can and must be refined.
The second unique skill: interpretation. Working with information frequently allows us to better judge the reliability and bias of the information we retrieve.
Most information you find will be tainted. Secondary expertise almost always presents information in a biased way. You will counter this bias both by being aware of the bias and by interviewing someone with a different view. An inventor proclaims a device near completion - do we believe? Obviously it requires further study. This is often lost on amateur researchers - by collecting information from a variety of different resources, with a range of bias, we can create a superior assessment of the value of each item of information. Research based solely on government research, no matter how well done, is unprofessional.
The third unique skill is speed. We must be able to provide research as a service, as a business, quickly. This goes beyond research to the banal work of copyright and legal protection, selecting effective research tools, finding fast expertise to supplement your own.
The skills of professional research are like those of the artist. They take a lifetime to learn. The work is just business.
- - - - - - - - - - - - - -
The Database Industry
The commercial information sphere existed in the 1970's and earlier. It is far more developed, far better organized, far better funded, almost always far more valuable and expensive than every other research resource.
For the most part, commercial information is arranged reasonably uniformly in large databases of full-text or bibliographic information. Some databases are small, single source documents, while others are vast unfocused collections of, for example, all the news from the last 15 years.
Most directories and journals can be made into a database, but single-source databases do not enjoy much financial success. The market is too limited and the cost of promotion too high (except in a local market with newspapers). To overcome this difficulty, single sources are grouped together into larger collections of databases on a particular topic. These large database groups have become primary tools in commercial research.
Developing these databases requires considerable expertise and expense. Sometimes data requires abstracting, interpreting, and as with some Lexis-Nexis and WestLaw databases, even expert legal interpretation. Sometimes firms develop a portfolio of databases. Sometimes firms build just one.
The marketing and consumer billing of such databases is then provided by a relatively small collection of large database retailers. A list can be found in our "Commercial Databases" article. As an indication of the size of this market, Knight-Ridder sold Dialog & Datastar for a figure approaching half a billion dollars.
This industry consists of a wide collection of players, each improving and developing the information from individual periodicals, journals, news items - all very confusing for the end user. This is elegantly illustrated by the database descriptions for Lexis-Nexis databases (their preferred term is libraries). See http://www.lexis-nexis.com/lncc/sources/ as an example of specific databases. In particular, see their library on patents.
Many single-sources appear in different commercial databases. Further, different databases sometimes include different information from the same single-source. One database may include just abstracts, another may include fulltext, chemical indexing and more.
As a result, most researchers are unfamiliar with what exactly is being searched.
This state of affairs is not unproductive. Searching a 'Database about Patents' is uncomplicated. You receive information on patents. It is simple, informative and incomplete. Of course, researchers are busy people. Time is critical. Results matter. We are familiar with this system from searching the web too. Just what are the differences between All-the-Web, Lycos and Altavista? If we fully understood the complexities of each available database, yet still had a few databases to consider - would our search be better? Often not. This system of incomplete information also leads to great customer loyalty to database retailers. Comparative information is dropped in favour of simplicity. Ultimately, I am hard pressed to compare prices let alone describe the differences between information products.
Prices actually model many a developed industry, remarkably similar to the telephone or banking industry. As one friend commented, "bullshit baffles the brains". The prices are complex on purpose. It becomes very unrewarding to compare prices, and any conclusions are only valid in specific circumstances - and will not hold in others. This trend, familiar to us as a multitude of banking changes and telephone pricing schedules, reinforces our need to stop price hunting and trust our favoured information retailers.
This is not to say we should not compare prices, just that you will find comparing prices a most unrewarding experience. It really requires you to search and retrieve the same information on different systems - and this does not even begin to touch different databases, or database groupings, or variables that change over time like download speeds.
Optimistically, there are actually very few important databases in each field. It may be simple to browse each of the databases in your field and compare directly. You may never need to know more than a few databases intimately.
Realistically, you will yearn for a simpler solution.
The commercial information industry has distributed information this way for several decades. It is both sophisticated and quite difficult. You will need to become experienced with inverted indexes, search techniques (Boolean, truncation, proximity, field limits ...) and properly phrasing the question in a way that will be answered by a database search. I have always found the value of a database search directly proportional to the length of the search query.
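The search techniques named above can be sketched in a few lines. This is a minimal illustration only, not the workings of any commercial database: the sample records, the function names (build_index, term, near) and the trailing * used for truncation are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Hypothetical sample records; the titles are invented for illustration.
RECORDS = {
    1: "Patent searching for chemical engineers",
    2: "Engineering databases and patent law",
    3: "Searching newspapers by date",
    4: "Chemical abstracts and engineering indexes",
}

def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(records):
    """Inverted index: each word maps to {record id: [word positions]}."""
    index = defaultdict(dict)
    for rec_id, text in records.items():
        for pos, word in enumerate(tokenize(text)):
            index[word].setdefault(rec_id, []).append(pos)
    return index

def term(index, word):
    """Records containing a word. A trailing * truncates: engineer* matches engineering."""
    if word.endswith("*"):
        stem = word[:-1].lower()
        hits = set()
        for indexed_word, postings in index.items():
            if indexed_word.startswith(stem):
                hits |= set(postings)
        return hits
    return set(index.get(word.lower(), {}))

def near(index, a, b, distance):
    """Proximity: records where a and b occur within `distance` words of each other."""
    hits = set()
    for rec_id in term(index, a) & term(index, b):
        pos_a = index[a.lower()][rec_id]
        pos_b = index[b.lower()][rec_id]
        if any(abs(x - y) <= distance for x in pos_a for y in pos_b):
            hits.add(rec_id)
    return hits

INDEX = build_index(RECORDS)

# Boolean operators become set operations on the record ids.
both = term(INDEX, "patent") & term(INDEX, "engineer*")      # AND
either = term(INDEX, "chemical") | term(INDEX, "newspapers")  # OR
without = term(INDEX, "patent") - term(INDEX, "law")          # NOT
close = near(INDEX, "chemical", "engineering", 3)             # proximity
```

Each added operator narrows or widens the result set in a controlled way, which is one reason a longer, more carefully built query tends to return a more useful set of records. Field limits are left out of this sketch; they would restrict the same set operations to one part of each record, such as the title.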
If you are incompletely skilled at database research, you will take longer, pay more and locate far more information (or unwisely discard more) than desired.
This is very different from searching Altavista and Webcrawler.
Doing your own research offers an opportunity to more closely influence the research process. Sometimes only you understand the topic and sometimes you can more quickly discard unimportant details. Certainly it is becoming simpler to undertake some work yourself.
Many of the commercial databases are also available in a CD format. Substantial subscription costs limit their availability to large research institutions and libraries, but exceptions exist. I believe world books in print costs AU$5000+. Provided you can find casual access, it will cost you far less. Keep an eye on the age, though. Sometimes (and only sometimes) online information is more recent.
The decision between undertaking research on your own or seeking external help is really a decision based on your research expertise, your budget, your access to information, your time, and the importance of finding all the information available. It also depends on your access to some decent research assistance. I will soon be able to help with this.
What I do know is a newcomer to the commercial information sphere will seriously underestimate the difficulty involved in searching, and underestimate both the cost of research and the cost of research assistance. Keep in mind this same system serves the needs of large commercial conglomerates, professional legal research, and well financed government studies. The commercial information sphere contains far more valuable information than you need. Sometimes the internet is just an interesting sneeze in comparison.
Article: The State of Databases Today: 2000 by Martha E Williams tracks the development of this industry with survey results. Found as the foreword of the Gale Directory of Databases.
- - - - - - - - - - - - - -
Squeezing the Info-Broker
I was reading an interesting article by Anthea Stratigos in ONLINE [1] that stirred me to thinking about the future of Information Brokerage. The article in question outlined the shift of information brokers into the marketing department, towards new roles in negotiating information access licenses, helping people understand and select appropriate resources - and oddly, in overseeing the intranet development process so as to deliver the information people need.
The article premise is rather accurate - as far as it goes. But I wonder if the true message behind this shift is the decline and death of information brokering as a profession? If information brokers (also known as information professionals) are moving to new roles, are they vacating the old roles, the traditional roles in the research process?
In my library, I reach for the Information Broker's Handbook [2] for a relevant quote:
"The heart and soul of the information broker's job is information retrieval. But many individuals offer information organization services as well."
So, Information Retrieval, and Information Organization. Anyone who has seen the simple information retrieval options incorporated in recent information packages can be in no doubt that the information retailing industry is minimizing the need to reach for an intermediary. Technology is certainly closing the gap - but this development has always been in the cards.
A central difficulty for information brokers is a simple maxim: provide better results than clients doing the search themselves. Often working in unfamiliar territory, a researcher may find it very difficult to excel. There are two dilemmas here. Firstly, while we may pride ourselves in accomplishing unique requests, we have high costs associated with one-off searches. There is little likelihood someone else will ask a similar question. There are simply no possible economies of scale.
Secondly, our search difficulty is not shared by the client. The client has difficulty with the technology - certainly. The client does not have difficulty with recognizing the wheat from the chaff, the gold embedded in the articles and at a basic level, the search words you will need to get to the right stuff.
There is a very good reason why university students are pushed to learn basic and sophisticated search technologies.
There is another take on this story.
Creating Value in the Network Economy [3] includes a chapter by Philip Evans and Thomas Wurster.
"emerging open standards and the explosion in the number of people and organizations connected by networks are freeing information from the channels that have been required to exchange it, making those channels unnecessary or uneconomical."
"Newspapers and banking are not special cases. The value chains of scores of other industries will become ripe for unbundling. The logic is most compelling - and therefore likely to strike soonest - in information businesses ... All it will take to deconstruct a business is a competitor that focuses on the vulnerable sliver of information in its value chain."
And in the back of my mind comes the thought that maybe the information retrieval function we have been providing is just one such information business. This business, attempting to be the pinnacle of the research process, is ripe for unbundling. Not only can our function be incorporated directly into the advertising and technology of the information resources we use, but our skill can also be coded into simpler and simpler guides and resources like my work on the Spire Project.
Perhaps as an industry we never managed to secure our captive market.
Initially, this will affect that mainstay of information brokerage: commercial database retrieval. And like the newspapers that will begin to lose the profit center of classified advertising (ripe for unbundling and delivered electronically), additional pressure will be applied to the business of providing information research services.
Eventually, we retreat to other areas as information professionals: Information Organization, Research Education and Training.
Somewhere amidst this story lies a new role for researchers. The need for research certainly exists and is forecast to grow dramatically as the information age develops. What is lost, sadly, is an understanding of the ease with which this work will be done. This is certainly destined to move away from being an industry for professionals working at $50/hr to $150/hr + costs! Others can provide this work, more easily than now. People we will most likely call researchers - and not information brokers.
This is more than a push towards specialization. There is another way to see this transformation. The information broker was a retail point for wholesalers who are now firmly selling directly to the consumer. There is much less of a need for an intermediary between database retailers and information consumers - and there is a firm trend in this direction.
Information brokers defined their role in the information industry as masters of the difficult technology of research, capable of finding most anything. Come to us when you are lost and we will find the answers - for a price. We know the technology, the meta-resources, the tricks used to find information. We routinely retrieve a higher quality of information, far faster, than you can yourself. The standard model: a library run service offering primarily database search & retrieval for their patrons.
This business model is coming to an end.
Yes, perhaps the information broker is dead. Soon to be replaced with low-wage researchers and research assistants, and high-end information executives and research trainers. Like it or not, most of us will incorporate a little more research into our current work, and reach for slightly more intelligible research resources. Everything else will be accomplished by true specialists.
[1] Online (a periodical with some coverage of library & information research). July/August 1999 p71-73, by Anthea Stratigos of Outsell Inc.
[2] The Information Broker's Handbook p.21, by Sue Rugge and Alfred Glossbrenner. Windcrest/McGraw-Hill. 1992.
[3] Creating Value in the Network Economy, Edited by Don Tapscott. Chapter 2: Strategy and the New Economics of Information by Philip Evans & Thomas Wurster. p.18 & 25. A Harvard Business Review Book.
Information Theory. Section 10
The Information Service Industry
Private Detectives, Professional Database Researchers, Library Researchers, Legal Researchers, Commercial Database Producers, Commercial Database Retailers, Magazines, News Organizations, Libraries: this is a big industry. Information Research is just a process linking together people seeking information with people who provide it.
It seems in vogue to reconsider all businesses as being in the information business. My accountant and your stockbroker both provide information services. While I agree these two professions are intensive users of information, I purchase their interpretation of information. It is not a trivial difference but nonetheless serves to cloud the true size of the industry just involved in selling you access to information.
From university days, I was aware of the large commercial database retail giants (Dialog, Dun&Bradstreet) and the database producers. I also met with some of the firms distributing largely to the library market (like SilverPlatter). Little further information about these businesses leaks beyond the research industry.
Some of the businesses are aimed primarily towards the library community. Database subscriptions are unlikely to interest an individual. Few are appropriate to businesses. Let us briefly scan just the products and services intended for a consumer.
Commercial Database Retailers - These organizations devote their effort to bringing commercial database information to individuals. Dialog, Datastar, Infomart, Lexis-Nexis and others will assist you to access information only available through commercial databases. (See our article, "Commercial Databases".)
Current News and Current Awareness - If you want to know of new articles and news important to you as it is reported, then there is a selection of services available: news by email, news by newsgroup, news by periodic automated database search, and other novel approaches. Costs for this service have fallen dramatically: effective solutions start at about US$10/month and are not strictly dependent on range & quality of information. (See our article, "Newswires & News Databases".)
Information Brokers - There is a whole industry of specialized researchers who will try to locate and compile research to your specifications. The backbone of this industry is payment for access to commercial databases, but different information brokers will gladly enter into any effort required to locate information. Information brokers, business librarians, legal researchers and others all use the tools described in this website, as a service for their clientele. (See our article, "Research as a Discipline".)
Patent Assistance - Patent searching is one of the more difficult branches of serious research. Some of the resources are free on the internet, and commercial patent databases are readily available through the database retailers. If there is serious money at stake, you must consider legal assistance. Certainly use lawyers for patent applications (beyond the scope of the Spire Project). But a patent can also be a research tool. Patent research can provide you with what is often the first appearance of costly commercial research. This is both a source of cutting edge solutions and competitive intelligence.
Media Monitoring - Certain firms solely focus on monitoring TV, radio & newspapers. These firms typically run teams who page through newspapers looking for matching articles, then post or fax to the client. New technologies are also advancing into this field.
Document Delivery - Most local bookstores will gladly help you locate a book from their directories but if you want a book from abroad, or an article from a journal or magazine, you will need the assistance of another set of information workers. A distinct but similar approach assists with the distribution of journal articles. Many of the document delivery firms are closely tied to information organizations. Little information is available about these organizations.
- - - - - - - - - - - - - -
Trends in the Information Sphere
For the past few years, individual database owners/maintainers have been flirting with the idea of making paid access available through the internet, rather than the existing system of allowing database retailing firms to promote and market their databases. I have heard rumours most database producers earn up to 30% of retail price when delivered through database retailers - 70% being retained by the database retailer.
The internet is not a commercially viable alternative...yet, but some databases have emerged with alternative funding despite this (Library of Congress, ERIC, Medline). Others are creeping in around the edges by offering subscribers access at a much reduced flat annual fee (Computer Select at one time). I expect most database producers are waiting for a meaningful way to charge. Digital money holds the key but despite the hype, practical use appears to be a medium to long-term reality.
A second trend is internet publishing itself. Gradually, the information is getting easier to locate. (Don't laugh please - it's undignified.) We are also getting better at using the internet as a tool to disseminate information. We have the very visible, if perhaps short-lived, search engines but also other efforts like archives of FAQs, archives of guidebooks, applying the Dewey decimal system to the internet, specialist directories, subject guides, specialist search engines. This will be a lively field for several years to come. As it gets easier to locate the good information, perhaps the lines between commercial quality and internet quality will begin to merge in places.
The third trend is the very promising prospect of paying for information by the page through the internet - viewing the results in a web page immediately. There are some technical hurdles yet, but certain elements are already appearing in ventures like DialogWeb. This step may prove profitable for ATM vendors and owners of internet cafes, pubs and kiosks. It will also herald a dramatic drop in the cost of information.
- - - - - - - - - - - - - -
Are We Developing an Informative Internet?
Several serious glitches have delayed the further improvement of the internet as an effective information resource. Oh, sure it is the world's largest library and thousands of new webpages are published every hour. But this trite statement disguises how slowly the informative value of the internet is developing.
Vision: The internet holds so very much promise. Marketing mantras tell us so, but few of us grasp this technology will completely rewrite the rules of community, government and the exchange of intellectually valuable information.
One of the hurdles is vision. We are not yet delivering the information pertaining to community, government and the exchange of intellectually valuable (improved) information. We are only proceeding quickly with market information and computer-related information. We are still toying with further ways the internet can transform other areas of our life.
We should have achieved more by now.
Organization: The net is still very disorganized. A number of developments promise to eventually make the internet less confusing and better organized. To date, we have several cumbersome techniques, a large collection of search tools and a great deal of potentially interesting links.
Publishing: As mentioned, thinking about who is publishing assists us with our search. Applying this to where information is emerging, we learn much of the best information is not reaching the internet. Certainly, the commercially generated information is not reaching the internet (covered below). The large research studies paid for by public funds and slowly aging on the shelves of government and non-government organizations are also not coming online. Government, institutional and commercial organizations primarily publish brochure-ware - as befitting the presentation of market information. (Even offering to publish such documents freely does not appreciably affect this trend as the restrictions are not financial, but a matter of mindset. See our past work.)
We should recognize few of the more valuable documents emerge online.
Further Reading: Socially Responsible Publishing on the Internet ('97) (Available on request) A Census of Regionally Important Documents on the Web ('96) (Available on request)
Discussion: The internet excites me with the promise of a real community rebirth arising from this technology. For the first time in history we should be able to discuss in an informed manner any number of issues from crime to taxation. Tied into this are issues of government transparency, international assistance, anti-corporate market reform and community involvement. Unfortunately, my experience with mailing lists and more recently with a newsgroup confirms the difficulties in developing discussion. Discussion groups function as notice boards. Unfortunately, the difficulties in developing participation, and in moderation, are just a little too cumbersome to be successful. For many discussion groups, the chaff overwhelms the wheat, and the information content is far from considerable.
The financial rewards are also minimal for establishing and maintaining discussion groups. Dramatic improvement to the informative value of the internet is unlikely to emerge here.
Further Reading: How to build a discussion on the Internet (by David Novak - available on request).
Rewards: We have alluded to the importance of editorial and organization on the internet. There are several severe limitations to this - first and foremost the difficulty in gathering financial rewards for meaningful work improving and organizing information.
I am being circumspect here. There is money available - just not where it is needed. The most important resources in professional research are the contents of the commercial information sphere. This sphere existed decades before the internet, is far better funded, and is far larger. To compare commercial and internet information is almost heresy. A bridge between these two, internet and commercial, emerges slowly.
Digital money should grease the exchange of information by dropping the cost of exchange considerably. Today, credit cards provide this service. This works, at times, but digital money would allow for small amounts of money to change hands. This appears to be a critical threshold for bringing much of the commercial information to the net.
About 5 years ago I was introduced to the Theseus Model - an economic model to pay the intellectual investment in publishing and organizing interactive multimedia. Years earlier there was Xanadu. While I have serious reservations about both, they do illustrate the intellectual foundations for effective use of a tool for exchanging small amounts of money. It opens the doors to direct delivery of copyright work - which in turn opens an effective economic model for publishing improved information on the internet.
Without digital money, proprietary information can only be exchanged digitally by gift (that is free - the initial driving force of the internet information sphere), or by credit-card purchase of access to passwords to external networks - the current method of accessing database retailers.
This has the unfortunate effect of limiting the interest both of internet users in the commercial information sphere and the commercial information retailers in the internet. Oh, there is movement in both directions, but not at the scale experienced in other industries.
Further Reading: The UWA Theseus Project (http://www.arts.uwa.edu.au/TheseusWWW/) The Xanadu project (http://www.xanadu.com or concise summary - http://www.sfc.keio.ac.jp/~ted/XU/XuPageKeio.html)
- - - - - - - - - - - - - -
A Look at Information Congestion
Finding information on the internet is a skill. Finding information on the commercial information sphere is also a skill. There is a great degree of overlap. The awareness of the general public as measured by use of commercial resources is very limited. This is further seen from the simple use of search engines & the abundance of simple web search.
To hammer this point in, let's take a momentary look at search engines. Most searches end in 1000's of results: here are the first 10. Do you really think the first 10 or 20 or 100 sites listed are particularly better than the next? No - you have a random selection of resources. A selection generated by computer based on the simplest of criteria. (We should also mention how some search engines sell placement in search results).
Remarkably, the search engine is the much-vaunted entryway to the world of information!?! Clearly search engines will not dramatically improve the informative value of the net - not by themselves.
Multiplication of Information
One complication of poor information organization is an inflation of overlapping information nuggets. Information on the internet is so difficult to locate we have almost a continual need for more publishing. Information must exist in numerous locations to reach an intended audience. Promotion of the simplest nature - recognition for the best for a given topic - becomes exceedingly difficult. Only when 20 sites publish or report a given fact does it become accessible.
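To make the point about simple criteria concrete, here is a deliberately naive ranking sketch. It does not describe how any particular search engine works; the pages, their text and the scoring rule (counting query words) are assumptions invented for the illustration.

```python
import re

# Hypothetical pages; the addresses and text are invented for illustration.
PAGES = {
    "a.example/patents": "patent research guide patent patent patent",
    "b.example/guide": "a careful guide to patent research with sources",
    "c.example/spam": "patent research patent research patent research",
}

def naive_score(query, text):
    """Count how often the query words occur. Nothing about quality is measured."""
    words = re.findall(r"[a-z]+", text.lower())
    return sum(words.count(q) for q in query.lower().split())

def rank(query, pages):
    """Sort page names by naive score, highest first."""
    return sorted(pages, key=lambda name: naive_score(query, pages[name]), reverse=True)

RANKED = rank("patent research", PAGES)
```

Under this rule the page that merely repeats the query words comes first and the careful guide comes last: a crude criterion orders the list without saying anything about which resource is better.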
Curiously, this is the state of affairs in the wider community. Promotion is an expensive specialty. Numerous copies, distributors and references are required to generate any kind of significant awareness. Why should the internet be different?
Actually, why should the internet be the same? Definitive sources like the US Census Bureau have no need to duplicate their information or to have alternative presentation sites. Yet such sites appear the exception. Consider a search for the best resources for patent research: we are greeted with 954 websites (Altavista search for "patent research" Jan-19-2001). Presumably, most of these sites discuss patent research - Right? There is no technical or theoretical need for such confusion. I wonder if such duplication may be more of an affliction than natural tendency.
Justification: It is relatively difficult to earn money from publishing improved information, or organizing information already on the internet. Given the intense interest in this technology, a collection of models have emerged. A brief tour of these models will highlight the financial limitations to improving the internet as an informative resource.
- - - Working for fame (but not payment)
This model works well in open source software programming, and some of this ethic certainly extends to publishing information.
- - - Simple altruism/complete lack of justification
School students and internet novices in particular may not need to justify anything. Unfortunately, such work is usually neither consistent nor persistent.
- - - Commercial promotion
Promotional funds can be used to publish information. Most promotion is short-sighted, limited to presenting market information (like product information), but in time government and associations will fund publishing in-house information for purely promotional reasons.
- - - Invested commercial businesses
There are certain commercial opportunities to earn money through banner advertising and sponsorship.
Direct payment for improved information (perhaps with digital money), direct payment to authors (Theseus model, royalty systems), and direct state sponsorship need not be necessary to fundamentally improve the internet as an information resource. Academic peer-reviewed journals do not pay for articles. Commercial periodicals are supported by advertising, and the token subscription costs of magazines usually just cover distribution costs. Fame motivates many efforts, not just online, and we do not feel the need to habitually justify everything we do.
In no small way, as more people become adept at publishing quickly, important information will move on the net faster. Similarly, information will also gradually become better organized. The economic models above will not improve the informative value of the internet the way direct payment would. Most current limitations have economic solutions. Unfortunately, my reasoned opinion is that no economic system will arrive in time to make a difference.
Conclusion
We know something of how information gets published, and how many important documents do not reach the internet. We have described how information is organized on the internet and how limited editorial vetting and organization have given rise to traits like superficial indexing, information duplication, and a need for research skills.
Financial rewards and financial tools are unlikely to solve these difficulties. We can only hope for a gradual growing out of our current difficulties. We will have more of the same for several years to come. It is simply the nature of the internet (as currently constructed).
For you, a greater understanding of the internet will assist you to judge the worth, likely source and likely venues of the information you seek. The same is true in the larger world... database, book & article. Each has different traits and qualities, reinforced over time. Your understanding of these traits and qualities in part defines your skill as a researcher.
As to the future of the internet, on the positive side, there are certain qualities to internet communication that make it uniquely valuable. Internet communication is inexpensive, relatively rapid, and increasingly accessible. On the negative side, the internet is badly vetted, potentially very time consuming, and up against very well entrenched systems that have been running for either decades or millenniums (considering databases or books). Elements like a promised but functionally absent digital money, and the lack of a meaningful way to recoup the costs of vetting online information, make matters worse. Despite this, despite ALL the teething and fundamental difficulties, the internet is sufficiently superior to ensure considerable continued effort to improve the informative value of the net.
- - - - - - - - - - - - - -
The Multiplication of Information Effect.
Just as the internet permits a multitude of voices and perspectives, so it permits - and promotes - a multitude of the same information. Yes. For several reasons, which we shall explore first, the internet multiplies the amount of information there is on a topic. This insight can be used to improve searching for information, as I will show at the end of this article.
The internet is a system of communication. Like all other systems (books, articles), the internet affects the way we communicate in its own particular ways. The absolute number of books depends on what is thought to be commercially viable. We could say books permit, and promote, a limited number of books on the same topic.
The internet does the opposite.
The sheer ease of publishing information on the net is one factor in information overkill. The net is an easy place to publish information, requiring only individual effort. There are no budgetary concerns, nor does attracting an audience initially enter into the publishing process, as it would with articles or books.
The ageless state of the internet also rapidly builds information. Old information is not removed from the web automatically as in mailing lists. Old books go out of print and past magazine articles are shelved, indexed and categorized so we must intentionally include them in our search. The web is not built this way, and information well past its natural expiry date remains.
A dramatic change is also occurring as our society becomes digital. In the pre-internet economy, experts and specialists in every field were distributed to meet local needs. In the networked world, expertise is not only shared more rapidly, but is required in fewer places - whether we speak geographically or intellectually. Said another way, in cyberspace, competition for expertise is most fierce. To be an expert, you need to be more expert than others within reach - and since gradually more and more experts are within reach - digitally - we form a glut of experts.
Oh, this is not a doomsday message - merely a middle ground on the way to increased specialization and focus. Historically, we can easily see that Newton was a scientist but Einstein was a theoretical physicist. Today we have quantum theorists. The future is full of very long job titles.
A by-product of this movement is a current glut of experts - perhaps a permanent glut of experts. With more people connected and satisfied with distant communication, a vet who writes about immunizing your dog becomes one of many you can reach, in several countries. Previously we may have been limited to those in our own state - but no longer! Now we can pick up immunization recommendations from any number of experts previously separated by distance or by media outlets with minimal overlap.
We can see this clearly on the web. I wrote an article on country profiles and yes, as expected, the UK, US, Canada & Australia all write and publish traveler advice notices on the web. Are they different? Occasionally. Is this a case of multiplication of information? Yes. We have reached beyond the applauded internet trait of permitting a multitude of communication and reached a state where similar information is interpreted by different organizations, and distributed electronically.
This is not unique to the internet. News stories also contain considerable overlap from one newspaper to another. A search for dog immunization on one of the large news databases will result in numerous articles all presenting essentially similar information. Business periodicals also have considerable overlap, and while each may attempt to differentiate its articles from others, there are severe limits - and besides, these periodicals most likely do not have an overlapping clientele.
But on the internet, there are overlapping readers. An article written for the web is an article written for everyone. Anyone can read it. Thanks to the popularity of search engines, it can be available to anyone. At least in theory.
This leads us to internet promotion. Information on the web is sometimes so difficult to locate we have an almost continual need for more publishing. Real traffic is difficult to promote normally, so websites devoted primarily to delivering information have a real difficulty reaching their audience. This translates either to the need for expensive commercial promotion, which often can not be justified, or into reaching only those who search carefully for your information. The latter means multiplication of the same information.
In writing this article, I see the effects mentioned will lead to changes in the future. As I write "attracting an audience initially enter into the publishing process", I think to myself this will obviously change. Attracting an audience will emerge in time as the primary step in publishing. There are many places to take this discussion, but my job is that of a researcher, or rather an internet-focused search theorist. (Long job titles will be in vogue.) Let us focus on how these changes affect the internet as an information resource.
1) Any effort to organize the internet is diluted because of these effects.
2) Any effort by the researcher to find different perspectives will be confounded by the number of people with the same perspective publishing in the same medium.
3) Certain fields are more heavily hit than others. Internet advice on what search engines to use is ubiquitous. Java programming hints are numerous. More specialized topics (like internet-focused search theory) are less affected.
4) Viral marketing, a catchword for sure, hopes to achieve promotion by seeding many sites with information. It is perhaps an innovative way of working with, rather than against, the multiplication of sites delivering the same or similar information.
In phrasing the question you wish to answer, before the search, experienced researchers will focus on what information is likely to be available in numerous overlapping versions. These questions can be answered with the search tools that cover information in a more random manner: Search Engines do this very well. Tightly focused questions, less likely to be distributed so completely, should be approached with different tools: mailing lists and nexus points, long complex search queries and index points.
In conclusion, the internet will become far more cluttered than we had expected. I had previously predicted that search engines would grow to meet the needs, but this is not to be. Search engines will continue to serve up answers available from multiple places in the world. There is market enough in this, and minimal need to tackle anything more.
Getting the Best from the Internet. Section 11
A search for information on the internet is not essentially different from the standard information search process. You still need to start by outlining carefully just what you are hoping to locate. You also need to be aware of the peculiarities of the internet as a researchable resource (or rather a collection of resources). If you expect instant delivery of exactly what you require, free, then you need a reality check (and I am sure you will get one real soon). Sadly, the printed media tends to overlook this.
As with all resources, the more familiar you are with a given resource, the more efficiently you will work. Get to know the internet for a time first. Understand how it works. Then re-adjust your expectations and file it as just another collection of resources, perhaps preferable in certain circumstances.
A Structured Approach to Searching
Much of this book has been devoted to describing what we could call a structured approach to finding information. We build a question, select a format and then search in an essentially static manner. There are only a few resources of interest for each format.
On the internet, we again do the same. If you want to search online periodicals (a specific format for information with specific qualities that might be appropriate) there are just a few sites to review. The search is simple and straightforward. Search, read, then reassess whether the results helped answer your question.
The structured approach has been a simpler way to introduce a far more important application. Searchers know where answers are already - without ever having read the answer before - without having studied the topic. This is, after all, one of the few reasons to even consider paying for professional search assistance.
How does a searcher know where answers lie?
By building up a clear understanding of what information is out there, where it resides, and how to get to it, a searcher learns to anticipate the location of answers. Anticipation is everything.
- - - - - - - - - - - - - -
Know Where to Look
Let's look at information itself. Information passes from producer, to organizer, to consumer. It travels many paths in this journey. Superficially, we can observe that internet communication travels via email, newsgroups, and webpages (and others). Let's call these tools.
Looking deeper, we observe info |