foreveroverhead: DITA Assignment: Blog Post 1: Web 1.0

Andrew Hill Student Number: 110053732

Digital Information Technologies and Architectures

Web 1.0

Introduction

The World Wide Web (WWW) and the internet have had a huge impact on the way in which information is gathered, displayed and stored within public libraries and, on a wider scale, on the way in which we as a society communicate. In this essay I focus mainly on databases and information retrieval systems, as they’re central to my work, and I consider their implications for public libraries.

World Wide Web and the Internet

The Web and the internet are often spoken of as if they were the same thing, which isn’t the case. The Web consists of a huge amount of information held on millions of computers which are connected via the internet (Chowdhury & Chowdhury 2001). A good analogy highlighting the difference is to think of the internet as a road and the Web as the car which navigates this road (MacFarlane et al 2011). The internet was created in the 1960s by the American military as a form of electronic communication (www.about.com 2011). The Web was developed by Tim Berners-Lee in 1989 at the European Laboratory for Particle Physics in Geneva (Chowdhury and Chowdhury 2001). It was created to make it easier for academics to access and share hyperlinked research documents remotely through the internet (MacFarlane et al 2011).

The internet and the Web work through what’s called the ‘client-server’ model (see Fig. 1). The client machine sends request messages to a server, which listens for these messages and replies with digital messages which the client then interprets. Web pages are stored on the servers and displayed by the clients (MacFarlane et al 2011).
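The exchange described above can be sketched in a few lines of Python. This is only an illustration using Python's standard library, not how any particular web server is built; the page content, the handler name PageHandler and the function fetch_page_once are made up for the example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical page held by the server (made up for this sketch).
PAGE = b"<html><head><title>Hello</title></head><body>Hello from the server</body></html>"


class PageHandler(BaseHTTPRequestHandler):
    """Server side: listens for a GET message and replies with the page."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        # Keep the example quiet instead of logging every request.
        pass


def fetch_page_once():
    """Start a local server, act as a client once, and return (status, body)."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)  # port 0: let the OS pick a free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # Client side: send a request message and interpret the reply.
        connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        connection.request("GET", "/index.html")
        response = connection.getresponse()
        body = response.read().decode("utf-8")
        connection.close()
        return response.status, body
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    status, body = fetch_page_once()
    print(status)
    print(body)
```

Here both roles run on one machine for convenience; on the real Web the client and the server are usually different computers connected via the internet.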

Fig. 1: Simple example of the client-server model

Source: www.4guysfromrolla.com (2011)

HTML

HTML has been described as the “mother tongue” of web browsers (www.html.net 2011). It’s a language which enables information to be presented on the Web in a logical and coherent way. HTML stands for “hypertext mark-up language”. It’s important to note that HTML isn’t a programming language but a mark-up language (http://www.w3schools.com/ 2011). The term “tag” refers to each individual mark-up code. A tag is written as letters or a word between angled brackets, that is, the less-than and greater-than symbols (Chowdhury & Chowdhury 2001). For example:

<title>, <body>

To allow the program reading the page to recognize where an element finishes, a closing tag is necessary, indicated by the forward slash:

</title>, </body>
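To see how opening and closing tags pair up, here is a small sketch using the HTMLParser class from Python's standard library. The sample mark-up and the names TagLogger and list_tags are invented for illustration; a real browser does far more than this.

```python
from html.parser import HTMLParser

# Hypothetical mark-up, made up for this sketch.
SAMPLE = "<html><head><title>My page</title></head><body><p>Hello</p></body></html>"


class TagLogger(HTMLParser):
    """Records every opening and closing tag in the order it is met."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("open", tag))

    def handle_endtag(self, tag):
        self.events.append(("close", tag))


def list_tags(markup):
    """Return a list of ("open" | "close", tag name) pairs for the mark-up."""
    parser = TagLogger()
    parser.feed(markup)
    parser.close()
    return parser.events


if __name__ == "__main__":
    for kind, tag in list_tags(SAMPLE):
        print(kind, tag)
```

Each tag that is opened in the sample is later closed, and the tags close in the reverse of the order in which they were opened.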

Below are links to a page I created during lab session two and my blog:

http://www.student.city.ac.uk/~abkb654/index.html

http://citylibrarysciencema.blogspot.com/2011/10/3rd-session-11102011-digital.html

As can be seen from my HTML page, hypertext (text which contains links to other texts) allows a page to link to other pages on the Web.
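In HTML a link is written with the a tag, whose href attribute holds the address of the target page. The sketch below, again using Python's standard HTMLParser, pulls the link targets out of a page. The sample page, its addresses and the names LinkCollector and extract_links are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical page with two links; the addresses are placeholders.
SAMPLE_PAGE = """
<html><body>
<p>Visit <a href="http://example.com/">an example site</a> or
read <a href="page2.html">a second page</a>.</p>
</body></html>
"""


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it meets."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(markup):
    """Return the link targets found in the mark-up, in document order."""
    collector = LinkCollector()
    collector.feed(markup)
    collector.close()
    return collector.links


if __name__ == "__main__":
    print(extract_links(SAMPLE_PAGE))
```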

Databases and Information Retrieval (IR)

For a public library, having access to organisational data in a centrally located database system is essential. In order for libraries to keep up with other information sources it’s imperative to have databases in place that enable both staff and customers to locate the information they need simply and effectively. Being able to store all necessary data centrally greatly reduces the likelihood of inconsistency and redundancy of information, something which wasn’t possible when data was stored on magnetic tapes and disks in master files. There is also the added advantage of greater security.

The stored data is made usable through a relational database management system (DBMS), based on the relational model first introduced by Dr Codd at IBM (www.IBM.com 2003), and through Structured Query Language (SQL). SQL is a structured, English-like language used to talk to the DBMS, both to query what’s in the database and to modify information. During the lab sessions we queried database tables using SQL. Below is an example of a statement to extract the required information on publishers based in New York:

SELECT PubID, name, company_name, city

FROM publishers

WHERE city = 'new york';

The semicolon at the end marks the end of the SQL statement.
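A query of this kind can be tried out with the sqlite3 module that ships with Python. This is a sketch under stated assumptions: the publishers table is rebuilt in memory, the three rows are invented sample data, and SQLite stands in for whichever DBMS was used in the lab. The function name publishers_in_city is also made up.

```python
import sqlite3

# Invented sample rows: (PubID, name, company_name, city).
SAMPLE_ROWS = [
    (1, "Example Press", "Example Press Inc", "New York"),
    (2, "Sample Books", "Sample Books Ltd", "London"),
    (3, "Demo House", "Demo House LLC", "New York"),
]


def publishers_in_city(city):
    """Build a small in-memory publishers table and query it by city."""
    connection = sqlite3.connect(":memory:")
    try:
        connection.execute(
            "CREATE TABLE publishers ("
            "PubID INTEGER PRIMARY KEY, name TEXT, company_name TEXT, city TEXT)"
        )
        connection.executemany(
            "INSERT INTO publishers VALUES (?, ?, ?, ?)", SAMPLE_ROWS
        )
        # The ? placeholder passes the city in safely instead of pasting it
        # into the SQL text.
        cursor = connection.execute(
            "SELECT PubID, name, company_name, city "
            "FROM publishers WHERE city = ? ORDER BY PubID",
            (city,),
        )
        return cursor.fetchall()
    finally:
        connection.close()


if __name__ == "__main__":
    for row in publishers_in_city("New York"):
        print(row)
```

One thing to watch for: in SQLite the = comparison on text is case-sensitive by default, so the value searched for has to match the way the city is stored. Other systems may treat case differently.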

The DBMS which I use in my role (Vubis-Smart) makes my daily work more efficient and greatly reduces the amount of time I’d otherwise spend retrieving information manually. The ability for the public to access the library catalogue and resources also improves the level of service available to our users.

The aspect of SQL and DBMS I found most challenging was learning to think in the logical way which the language requires. When I retrieve unstructured information at work in the library, search engines such as Google and Bing do all of the logical work for me, so it took a while to get used to this structured, logical format. It gave me a good insight into the mechanics behind the seeming surface-simplicity of database retrieval systems. It’s important to note that IR systems are different to database systems. Database systems operate on structured data whereas IR systems work on unstructured data (www.wiki.answers.com 2011). Database retrieval results are objectively relevant whereas IR results are subjectively relevant (MacFarlane, 2011). The IR process involves the following:

USER: the person using the technology.

SYSTEM: the technology with which to carry out the search, which mediates the communication between the USER and the SOURCES.

SOURCES: the web, online systems, etc. (Chowdhury 1999)

Information retrieval in public libraries has become much faster through the use of search engines such as Google. Of the main classes of web querying, Navigational (intent to navigate to a particular site) and Informational (intent to retrieve information believed to be stored on one or more web pages) are the two I use most regularly in my job. In order to specify what I want from my search I can use Boolean operators such as AND, OR and NOT. AND narrows a search, OR broadens a search, and NOT excludes certain records, for example: rock music NOT pop music. With Google, which can be seen as a Best Match system that ranks results by how well they match the query, it’s not generally necessary to use these operators.
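The effect of the three Boolean operators, and the contrast with best match, can be shown with a toy search index in Python. Everything here is invented for illustration: the four short documents, the function names, and the very simple scoring, which just counts how many query words a document contains. Real search engines such as Google rank results in far more sophisticated ways.

```python
# Invented mini-collection: document number -> text.
DOCUMENTS = {
    1: "rock music history",
    2: "pop music charts",
    3: "rock and pop music festival",
    4: "library opening hours",
}


def build_index(documents):
    """Map each word to the set of documents that contain it."""
    index = {}
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index.setdefault(term, set()).add(doc_id)
    return index


INDEX = build_index(DOCUMENTS)


def matching(term):
    """Documents containing the term (empty set if the term is unknown)."""
    return INDEX.get(term.lower(), set())


def boolean_and(a, b):
    return matching(a) & matching(b)  # both terms: narrows the search


def boolean_or(a, b):
    return matching(a) | matching(b)  # either term: broadens the search


def boolean_not(a, b):
    return matching(a) - matching(b)  # first term without the second


def best_match(query):
    """Rank documents by how many of the query words they contain."""
    scores = {}
    for term in query.lower().split():
        for doc_id in matching(term):
            scores[doc_id] = scores.get(doc_id, 0) + 1
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    print(boolean_and("rock", "music"))
    print(boolean_or("rock", "pop"))
    print(boolean_not("rock", "pop"))
    print(best_match("rock music"))
```

The Boolean functions return an unordered set of documents that either match or don't, whereas best_match returns a ranked list in which partial matches still appear, further down.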

Conclusion

The most interesting part of the module for me so far has been seeing the frameworks which go into the production of databases and IR systems. As a user you’re presented with an interface which is easy to navigate and sleek in appearance, whereas the mechanics beneath the surface are far more complex than what’s presented on screen. It’s a bit like the production and the driving of a car: a huge amount of ingenuity goes into making it easy for you to get from A to B. Understanding the basic mechanics which go into the creation of the database and IR systems I use on a daily basis has given me a much greater appreciation of how these systems work. With increasing emphasis put on digitised collections, public libraries and the staff who work in them have to be able to implement these new technologies in order to keep up with the various other modern means of information retrieval if they’re to stand any chance of surviving as viable information sources in the twenty-first century.

References

Bellis, M. (2011) The History of the Internet [www.document]

http://inventors.about.com/od/istartinventions/a/internet.htm (accessed 18/10/2011)

Chowdhury, G.G. 1999. Introduction to Modern Information Retrieval. London: Library Association Publishing.

Chowdhury, G.G. and Chowdhury, S. 2007. Organizing Information: From the Shelf to the Web. London: Facet Publishing.

MacFarlane, A. (2011). PRD1: Digital Information Technologies and Architectures, Lecture 4: Information Retrieval. [Class handout]. Retrieved from http://moodle.city.ac.uk/mod/resource/view.php?id=267314

MacFarlane, A., Butterworth, R. and Dykes, J. (2011). PRD1: Digital Information Technologies and Architectures, Lecture 2: The Internet and the World Wide Web. [Class handout]. Retrieved from http://moodle.city.ac.uk/mod/resource/view.php?id=267300

No Author (2011) Client-Server Model [www.document]

http://www.4guysfromrolla.com/ASPScripts/PrintPage.asp?REF=%2Fwebtech%2FTheBook%2Fpage1.asp (accessed 18/10/2011)

No Author (2011) HTML Introduction [www.document]

http://www.w3schools.com/html/html_intro.asp (accessed 15/10/2011)

No Author (2011) Database Systems and Information Retrieval Systems [www.document]

http://wiki.answers.com/Q/Discuss_the_differences_between_database_system_and_information_retrieval_system (accessed 20/10/2011)

No Author (2003) Former IBM Fellow Edgar (Ted) Codd passed away on April 18 http://www.research.ibm.com/resources/news/20030423_edgarpassaway.shtml (accessed 18/10/2011)