Saturday 22 October 2011

DITA Assignment: Blog Post 1: Web 1.0

Andrew Hill   Student Number: 110053732
Digital Information Technologies and Architectures

Web 1.0

Introduction

The World Wide Web (WWW) and the Internet have had a huge impact on the way information is gathered, displayed and stored within public libraries and, on a wider scale, on the way we communicate as a society. In this essay I focus mainly on databases and information retrieval systems, as they are central to my work, and consider their implications for public libraries.

World Wide Web and the Internet

The Web and the internet are often spoken of as if they were the same thing, which isn’t the case. The Web consists of a huge amount of information located on millions of computers which are connected via the internet (Chowdhury & Chowdhury 2001). A good analogy highlighting the difference is to think of the internet as a road and the Web as the car which travels along it (MacFarlane et al 2011). The internet was created in the 1960s by the American military as a form of electronic communication (www.about.com 2011). The Web was developed by Tim Berners-Lee from 1989 at the European Laboratory for Particle Physics in Geneva (Chowdhury & Chowdhury 2001). It was created to make it easier for academics to access and share hyperlinked research documents remotely through the internet (MacFarlane et al 2011).
The Internet and Web work through what’s called the ‘client-server’ model (see fig.1). The client machine sends messages to the servers, which listen for these messages and reply with digital messages that the client then interprets. Web pages are stored on the servers and displayed by the clients (MacFarlane et al 2011).









Fig. 1: Simple example of the client-server model
Source: www.4guysfromrolla.com (2011)
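
The request-and-reply exchange described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the page content and the names `PageHandler` and `fetch_page_once` are made up for the example, and both the server and the client run on the same machine rather than across the internet.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical page held by the server (not taken from any real site).
PAGE = b"<html><body>Hello from the server</body></html>"


class PageHandler(BaseHTTPRequestHandler):
    """Server side: listens for a GET message and replies with the page."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # suppress the default request logging


def fetch_page_once():
    """Start a local server, act as a client once, then shut the server down."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)  # port 0 = any free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # Client side: send a request and read the server's reply.
        client = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        client.request("GET", "/")
        reply = client.getresponse()
        status, body = reply.status, reply.read()
        client.close()
        return status, body
    finally:
        server.shutdown()
        server.server_close()
```

Calling `fetch_page_once()` returns the status code and the bytes of the page, which a real browser would then interpret and display.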

HTML

HTML has been described as the “mother tongue” of web browsers (www.html.net 2011). It’s a language which enables information to be presented on the Web in a logical and coherent way. HTML stands for “hypertext mark-up language”. It’s important to note that HTML isn’t a programming language but a mark-up language (http://www.w3schools.com/ 2011). Each individual mark-up code is called a “tag”. A tag is written as letters or words between angled brackets, that is, the less-than and greater-than symbols (Chowdhury & Chowdhury 2001). For example:

<title>, <body>

To allow the browser to recognise where an element ends, a closing tag is necessary (indicated by the forward slash):

</title>, </body>

Below are links to a page I created during lab session two and my blog:

http://www.student.city.ac.uk/~abkb654/index.html
http://citylibrarysciencema.blogspot.com/2011/10/3rd-session-11102011-digital.html

As can be seen from my HTML page, the implementation of hypertext (a text which contains links to other texts) allows for links to other pages on the web.
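
The pairing of opening and closing tags described in this section can be checked mechanically. The sketch below uses Python's standard `html.parser` module; the class name `TagLogger` and the function `list_tags` are my own illustrative names, and the markup passed in is a made-up fragment rather than my actual page.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Records every opening and closing tag in the order it appears."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("open", tag))

    def handle_endtag(self, tag):
        self.events.append(("close", tag))


def list_tags(markup):
    """Return a list of ("open" | "close", tag name) pairs for the markup."""
    parser = TagLogger()
    parser.feed(markup)
    parser.close()
    return parser.events
```

For the fragment `<title>My page</title><body>Hello</body>`, `list_tags` reports each tag being opened and then closed, which is the structure a browser relies on.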

Databases and Information Retrieval (IR)

For a public library, having access to organisational data in a centrally located database system is essential. For libraries to keep up with other information sources, it’s imperative to have databases in place that enable both staff and customers to locate the information they need simply and effectively. Storing all necessary data centrally greatly reduces the likelihood of inconsistent and redundant information, something which wasn’t possible when data was stored on magnetic tapes and disks in master files. There is also the added advantage of greater security.
One of the ways in which the stored data is made usable is through relational database management systems (DBMS), which are based on the relational model first introduced by Dr Codd at IBM (www.IBM.com 2003), and Structured Query Language (SQL). SQL is a structured variation of English and is used to talk to the DBMS, both to query what’s in the database and to modify information. During the lab sessions we queried database tables using SQL. Below is an example of a statement that extracts the required information on publishers based in New York:

SELECT PubID, name, company_name, city
FROM publishers
WHERE city = "new york";

The ; at the end marks the end of the statement.
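
To show the same kind of query running end to end, here is a minimal sketch using Python's built-in `sqlite3` module instead of the MySQL server used in the lab. The table contents are invented sample rows, not the real biblio data, and `publishers_in_city` is a hypothetical helper name. SQLite compares text case-sensitively by default, so `COLLATE NOCASE` is added to make "new york" match "New York".

```python
import sqlite3


def publishers_in_city(city):
    """Build a tiny in-memory publishers table and query it by city."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE publishers ("
        "pubid INTEGER PRIMARY KEY, name TEXT, company_name TEXT, city TEXT)"
    )
    # Invented sample rows, for illustration only.
    conn.executemany(
        "INSERT INTO publishers VALUES (?, ?, ?, ?)",
        [
            (1, "ACME", "Acme Publishing", "New York"),
            (2, "NORTH", "Northern Press", "Boston"),
            (3, "HUDSON", "Hudson Books", "New York"),
        ],
    )
    rows = conn.execute(
        "SELECT pubid, name, company_name, city "
        "FROM publishers "
        "WHERE city = ? COLLATE NOCASE "
        "ORDER BY pubid",
        (city,),
    ).fetchall()
    conn.close()
    return rows
```

Calling `publishers_in_city("new york")` returns only the two sample publishers whose city is New York.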

The DBMS which I use in my role (Vubis-Smart) makes my daily work more efficient and greatly reduces the time I’d otherwise spend retrieving information manually. The public’s ability to access the library catalogue and resources also improves the level of service available to our users.
The aspect of SQL and DBMS I found most challenging was learning to think in a logical way that matched the language being used. When I retrieve unstructured information in the library, search engines such as Google and Bing do all of the logical work for me, so it took a while to get used to this structured format. It gave me a good insight into the mechanics behind the apparent simplicity of database retrieval systems. It’s important to note that IR systems are different to database systems. Database systems operate on structured data whereas IR systems work on unstructured data (www.wiki.answers.com 2011). Database retrieval results are objectively relevant whereas IR results are subjectively relevant (MacFarlane, 2011). The IR process involves three components:

USER: the person using the technology.

SYSTEM: the technology with which to carry out the search, which mediates the communication between the USER and the SOURCES.

SOURCES: the web, online systems, etc. (Chowdhury 1999)

Information retrieval in public libraries has become much faster through the use of search engines such as Google. Of the main classes of web querying, Navigational (intent to navigate to a particular site) and Informational (intent to retrieve information believed to be stored on one or more web pages) are the two I use most regularly in my job. To specify what I want from a search I can use Boolean operators such as AND, OR and NOT. AND narrows a search, OR broadens a search, and NOT excludes certain records, for example: rock music NOT pop music. With Google, which can be seen as a Best Match system that ranks results by estimated relevance, it’s not generally necessary to use these operators.
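
The effect of the three Boolean operators can be illustrated with Python sets. This is a toy sketch, not how a real search engine works: the four "documents" are invented, and `matching` is a hypothetical helper that simply looks for a whole word in each one.

```python
# Invented mini-collection: document id -> text.
DOCUMENTS = {
    1: "rock music history",
    2: "pop music charts",
    3: "rock and pop music festival",
    4: "classical guitar lessons",
}


def matching(term):
    """Return the ids of the documents that contain the given word."""
    return {doc_id for doc_id, text in DOCUMENTS.items() if term in text.split()}


rock_and_music = matching("rock") & matching("music")  # AND narrows the result
rock_or_pop = matching("rock") | matching("pop")        # OR broadens the result
rock_not_pop = matching("rock") - matching("pop")       # NOT excludes records
```

With this collection, `rock_and_music` is `{1, 3}`, `rock_or_pop` is `{1, 2, 3}` and `rock_not_pop` is `{1}`, which matches the narrowing, broadening and excluding behaviour described above.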

Conclusion

The most interesting part of the module for me so far has been seeing the frameworks which go into the production of databases and IR systems. As a user you’re presented with an interface which is easy to navigate and sleek in appearance, whereas the mechanics behind the surface are far from what’s presented on screen. It’s a bit like the production and the driving of a car: a huge amount of ingenuity goes into making it easy for you to get from A to B. Understanding the basic mechanics behind the database and IR systems which I use on a daily basis has given me a much greater appreciation of how these systems work. With increasing emphasis put on digitised collections, public libraries and the staff who work in them have to be able to implement these new technologies and keep up with other modern means of information retrieval if they’re to stand any chance of surviving as viable information sources in the twenty-first century.




References

Bellis, M. (2011) The History of the Internet [www.document]

Chowdhury, G.G. 1999. Introduction to Modern Information Retrieval. London: Library Association Publishing.

Chowdhury, G.G. and Chowdhury, S. 2007. Organizing Information: From the Shelf to the Web. London: Facet Publishing.

MacFarlane, A. (2011). PRD1: Digital Information Technologies and Architectures, Lecture 4: Information Retrieval. [Class handout]. Retrieved from http://moodle.city.ac.uk/mod/resource/view.php?id=267314

MacFarlane, A., Butterworth, R. and Dykes, J. (2011). PRD1: Digital Information Technologies and Architectures, Lecture 2: The Internet and the World Wide Web. [Class handout]. Retrieved from http://moodle.city.ac.uk/mod/resource/view.php?id=267300

No Author (2011) Client-Server Model [www.document]

No Author (2011) HTML Introduction [www.document]

No Author (2011) Database Systems and Information Retrieval Systems [www.document]

No Author (2003) Former IBM Fellow Edgar (Ted) Codd passed away on April 18 http://www.research.ibm.com/resources/news/20030423_edgarpassaway.shtml (accessed 18/10/2011)




Tuesday 11 October 2011

3rd Session 11/10/2011 Digital Information Technology Architecture

Databases

During this session we looked at ways to locate and manage data within a database. We briefly looked at how the introduction of computers into the workplace in the 1950s and 1960s created a growing need to store and retrieve larger amounts of data.

Databases were defined as "an integrated collection of data shareable between users and application systems" (Butterworth, 2011). We looked at how the data stored within databases can be grouped and related so that searches succeed. For example, if you were looking for a person or employee, just typing their name is unlikely to be enough to locate them; but because we can combine search criteria (name, department, post code, etc.) we're able to narrow down the search and make locating their data more likely.

What is a Database?

A database is a collection of data tables. A data table is a two-dimensional table of data, made up of rows and columns. Databases should only capture the attributes needed and should discard any unnecessary information.

Designing a Database

The process of database design begins with what was termed in class an 'Entity Relationship' (ER) model. The ER model sets out the entities you need to store and retrieve within the database and the ways in which they are related. We were told that ER modelling was too advanced for this course but that it was important to have a basic understanding of the rules that govern it.

Structured Query Language, or SQL

  • Occasionally pronounced 'sequel'
  • Has been around since the late 1970s

SQL was deemed so well designed that its core has needed very little modernising since. SQL is a language used for talking to database management systems and is in a sense a structured variation of English. It can be used to query what's in a database and to insert, modify or delete data in tables.
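
The four kinds of statement mentioned here (insert, modify, delete and query) can be sketched as follows. This uses Python's built-in `sqlite3` module rather than MySQL, and the table, ISBNs and titles are invented for the illustration; `demo_statements` is a hypothetical function name.

```python
import sqlite3


def demo_statements():
    """Run one INSERT, UPDATE, DELETE and SELECT against a throwaway table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE titles (isbn TEXT PRIMARY KEY, title TEXT, year_published INTEGER)"
    )
    # INSERT adds rows (invented sample data).
    conn.execute("INSERT INTO titles VALUES ('111', 'Database Basics', 1993)")
    conn.execute("INSERT INTO titles VALUES ('222', 'Learning SQL', 1995)")
    # UPDATE modifies data that is already stored.
    conn.execute("UPDATE titles SET year_published = 1994 WHERE isbn = '111'")
    # DELETE removes rows.
    conn.execute("DELETE FROM titles WHERE isbn = '222'")
    # SELECT queries what is left.
    rows = conn.execute("SELECT isbn, title, year_published FROM titles").fetchall()
    conn.close()
    return rows
```

After these statements only the first book remains, with its year changed to 1994.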

Database Solutions

Database solutions are ways of storing all necessary data in a central database, decreasing the likelihood of inconsistency and redundancy (that is, duplicated data). It was mentioned that it's good when you own the data, as you can then structure it to your needs, make it homogeneous in structure and more easily ensure your searches run smoothly.

Note: There was a lot more info but I'm not sure how much use it would be to (attempt) to elaborate any further.

Lab Session 3

We were first given some practical exercises (as copied from Moodle below):

Task 1 : Log on to your Unix Account

Please use your normal username/password to log on to your unix account as follows:




  • Login to the swindon computer, which is a Unix machine operated by the Central Services. You will need to type your username and password to log in to swindon - for security reasons the password may not show up as you type:







    • From City Windows Labs - Run the Telnet utility Start > Programs > O to P > Putty 0.60 > Putty. Where it says Type in Host Name (or IP Address) , enter swindon.city.ac.uk and click the Open button to login. We recommend that you select the ssh option.
    • From off campus use SSH - Run the ssh program from your menu. Click the Quick Connect button and type in swindon.city.ac.uk where it says Host Name:. Click the Connect button to login. You should then login to vega using 'ssh vega'. SSH is available on the module software page.
    • At the command menu on unix, you will see the option [U] Unix shell: type 'u' to obtain the unix command line.

    Task 2 : Start a MySQL Database Session

    You now need to start a MySQL session. Please follow the instructions outlined in the task carefully so that you can run your queries.
    1. Once you have logged in to swindon and have the command line ready, run the MySQL application by using the command: mysql -p -u biblio -h vega.soi.city.ac.uk (note that you should be able to copy this command and others from this window in your browser and paste them into the SSH window either by right clicking and selecting '>Paste' or by clicking the right mouse button in the SSH window)
    2. When prompted, use the following password to open the biblio database: b1bl10 (bravo-one-bravo-lima-one-zero)
    3. Load the biblio database by using the command: use biblio;
    This was just to get us into the right program to carry out the tasks. From here we were asked to familiarise ourselves with MySQL and the 'biblio' database.

    Task 3 : Familiarise yourself with MySQL and the 'biblio' Database

    Have a look at this diagram - it shows the relationship between tables in the 'biblio' database, and the data which is held in each table. Familiarise yourself with the information held and the relationships. You'll need to refer to this diagram to formulate your SQL queries.



    Biblio table relationships

    You can find out more about MySQL and the biblio database by using the following commands :
    • show tables; - show what tables are available in the biblio database.
    • desc authors; - show details of one of the database tables (in this case authors).
    Also note that MySQL commands must be ended - usually with a semicolon ';'
    An alternative is to use \g or \G to format the output in different ways.
    Try issuing some SQL queries from the MySQL command line and ending the commands with :
    • \G - record by record view, e.g. :
              select year_published, title from titles where year_published < 1970 \G
    • \g - output table view, e.g. :
              select year_published, title from titles where year_published < 1970 \g
    Then came the 10 tasks, of which I got to number 6.

    Task 4 : Querying a Database

    Develop SQL queries to return following information :
    1. A list of the PubID, Name, Company Name and City for all publishers based in the city of New York
    2. A list of all fields for publishers named Prentice Hall.
    3. A list of the Title, Year and ISBN for all titles published in 1994.
    4. A list of the Title, Year, ISBN and PubID for all titles published since 1980 in year order
    5. A list of all fields in the Titles table for books whose title begins with the word 'database' (regardless of upper/lower case letters)
    6. A list of all fields in the Titles table for books whose title contains the word 'database' anywhere in the title (regardless of upper/lower case letters)
    7. A list of the title, Year Published and ISBN for all books with 'SQL' in the title written since 1990 in date order
    8. A list of the Company Names of all publishers who have published books on programming since 1990
    9. The name of the publisher who published a book with ISBN 0-0280074-8-4
    10. The name of the author who wrote "A Beginner's Guide to Basic" listing also, the ISBN and name of this book.
    Below is a simple illustration of the procedures I followed to complete the first 5 tasks:

    Task 1:

    SELECT Pubid, name, company_name, city
    FROM publishers
    WHERE city = "new york"; -- the ; is needed to end the command

    Task 2:

    SELECT * -- * stands for all columns
    FROM publishers
    WHERE company_name = "prentice hall";

    Task 3:

    SELECT title, year_published, isbn
    FROM titles
    WHERE year_published = 1994;

    Task 4:

    SELECT title, year_published, isbn, pubid
    FROM titles
    WHERE year_published >= 1980
    order by year_published;

    Task 5:

    SELECT *
    FROM titles
    WHERE title like "database%"; -- % is used (as Richard said) as a "wild card"
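
The difference between a title that begins with 'database' and one that contains it anywhere comes down to where the % wild card is placed. The sketch below tries both patterns with Python's built-in `sqlite3` module; the sample titles are invented, `titles_like` is a hypothetical helper, and the real biblio table on the MySQL server may behave slightly differently. In SQLite, LIKE ignores upper/lower case for ordinary English letters by default.

```python
import sqlite3

# Invented sample rows standing in for the titles table.
SAMPLE_TITLES = [
    ("Database Design", 1992),
    ("Introduction to Database Systems", 1995),
    ("Programming in C", 1988),
    ("DATABASE Administration", 1979),
]


def titles_like(pattern):
    """Return the sample titles that match a LIKE pattern."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE titles (title TEXT, year_published INTEGER)")
    conn.executemany("INSERT INTO titles VALUES (?, ?)", SAMPLE_TITLES)
    rows = conn.execute(
        "SELECT title FROM titles WHERE title LIKE ? ORDER BY title",
        (pattern,),
    ).fetchall()
    conn.close()
    return [title for (title,) in rows]
```

`titles_like("database%")` matches only the two titles that start with the word, while `titles_like("%database%")` also picks up "Introduction to Database Systems".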

    Even though this was the toughest lab session so far, I also found it the most enjoyable.



    

    Wednesday 5 October 2011

    2nd Session 3/10/2011 Digital Information Technology Architecture

    Session 2:
    In this session we looked in greater depth at what the internet and World Wide Web are and how they function.  A lot of the lecture towards the end became a little complicated (for my tiny mind), but once the practical session had finished I definitely felt more confident in what was taught in the lecture. Phew.
    The Internet
    We first distinguished between the World Wide Web and the internet. An analogy was given whereby the internet was compared to a road and the World Wide Web to a car which is driven on the road (Butterworth, 2011). The internet was originally created by the US military in the 1960s. Its main function is to allow ‘remote’ computers to communicate and share information. The internet has obviously had a huge impact on the way in which we communicate and share information with each other. Some people also believe it’s having huge effects on the way in which we think and how our brains operate.
    The World Wide Web
    The WWW was created by Tim Berners-Lee in the early 1990s. It was originally intended for the sharing of information between academics, but due to the quality and efficiency of its design, it soon became clear that it could be used in a wider context. Many industries have been hugely affected by the web, none more so than the music industry, because of how easily users can share and download files.
    Servers and Clients
    The client-server model is what allows the internet and WWW to operate. The machines which provide services to other machines are servers (http://computer.howstuffworks.com/internet/basics/internet-infrastructure9.htm). The machines which connect to these servers are clients. Servers are programmed to ‘listen’ for and pick up on any information being sent to them. The server then sends information back to the client from which the message originally came (Butterworth, 2011). The servers contain web pages, and the clients decode them and allow us to view the pages.
    Hypertext, Data Mark-up and Language
    We then moved on to discuss Hypertext, Data Mark-up and Language. Hypertext, in basic terms, is a form of natural language text that allows you to link to other areas of a document or to other documents. This linking is what allows a user to view referenced materials. The documents linked to the ‘original’ document can be in the form of media, music or text. The Hypertext Mark-up Language is what allows linking of documents to take place on such a large scale (Butterworth, 2011).
    Creating Webpages
    From here we looked at how to create a webpage and what goes in to their formulation, something I’ll try and summarise below in the ‘lab’ section.

    LAB SESSION 2
    This practical session was aimed at understanding how to produce some (very) simple linked hypertext pages and then convert them into web pages.

    The first task was to browse a website which gave advice and guidance on how to set up an HTML page. The one recommended to us was http://www.w3schools.com/html/.
    We were then asked to find out about three different tags, of which I chose the first three on the list:
    1. Paragraphs: <p>
    2. Line breaks: <br>
    3. Horizontal rules: <hr>
    We were then asked to play around with a simple HTML set up. The example given was:
    A Simple HTML Page With Hyperlink
    <HTML>
      <HEAD>
        <TITLE>A Simple HTML Page</TITLE>
      </HEAD>
      <BODY>
        A web page using HTML to produce
        a hyperlink to
        <a href="http://www.city.ac.uk/">
        City University</a>.
      </BODY>
    </HTML>
    I copied and pasted this simple example into Wordpad and saved the file as ‘first.html’. Saving it with the .html extension ensured that it would be recognised as a Hypertext Mark-up Language file. I then opened the file in Internet Explorer to view it as a web page.
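
As a rough check of what the browser sees in the example page, the hyperlink can be pulled out with Python's standard `html.parser` module. The page text is the example given in the lab; the names `LinkCollector` and `find_links` are my own and only meant as a sketch.

```python
from html.parser import HTMLParser

# The example page from the lab, stored as a string.
EXAMPLE_PAGE = """<HTML>
  <HEAD>
    <TITLE>A Simple HTML Page</TITLE>
  </HEAD>
  <BODY>
    A web page using HTML to produce
    a hyperlink to
    <a href="http://www.city.ac.uk/">
    City University</a>.
  </BODY>
</HTML>"""


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive in lower case, whatever case the page uses.
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)


def find_links(markup):
    """Return the list of link targets found in the markup."""
    collector = LinkCollector()
    collector.feed(markup)
    collector.close()
    return collector.links
```

Running `find_links(EXAMPLE_PAGE)` returns the single City University address that the `<a>` tag points to.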

    From here I had to change the information which was contained within the example, adding things such as my name, my course title, a few links to various websites and an image (see link below for HTML version).

    http://www.student.city.ac.uk/~abkb654/index.html
    This is as far as I got in this session (as it only lasted an hour) and I will be completing it in next week’s class.