Lecture 12: World Wide Web, Part II

We will be continuing our discussion on the World Wide Web. If you recall, in the last class we were talking about the HTTP protocol, and we had mentioned that HTTP lies at the heart of the World Wide Web. Whenever a client contacts a web server and the responses are sent back, the exchange is carried out through HTTP. So today we continue our discussion from there; World Wide Web, Part II, is the topic of today's lecture.

We start by looking at the requirements of a web server. Suppose someone asks you to design a web server: what are the issues that you, as a designer, will have to take care of? The most basic requirement is that the web server must be able to accept HTTP requests, because the web is built on the HTTP protocol. This first point is absolutely mandatory. For the majority of requests the request type will be either GET, where you are trying to fetch a web page, or HEAD, where you are trying to fetch information about a page; and if you are supporting forms, then possibly also POST. You will recall that there are other HTTP commands as well, such as PUT and DELETE, but these three are the most basic ones. With GET and POST you can also submit a query string, so in addition the server must have a facility for handling so-called server-side scripts. Server-side scripts are nothing but programs: executables that reside on the server machine. They can be written in any language, for instance in a scripting language such as Perl, or in languages like C or Java. Whenever you submit a query string using GET or POST, these executables get executed.
Of course, which program you want executed must be mentioned in the GET or POST command. After execution, the output of the program is sent back to the browser or client, typically in the form of an HTML page. So this is a simple requirement that a web server must conform to.

In addition, when a typical web server is installed, you will find that some conventions are followed for the directory structure. For instance, there is something called the root directory: all HTTP files and folders are created under this root. As an example, the root directory may be /home/httpd; some installations of web servers like Apache have used this as the default server root. Similarly, there is a subtree located under the root which is called the HTTP home directory. The home directory is where all your web pages are located, so whenever you put web pages on the server, they must go under the home directory. In contrast, the root directory is where all the files related to the web server are stored: some of them are your actual web pages, and some are other miscellaneous files. Typically the home directory can be something like /home/httpd/docs. Optionally, there can be another directory under the root in which all the so-called Common Gateway Interface (CGI) programs and other scripts are stored; as an example, /home/httpd/cgi-bin. This is a typical name. Normally, server-side execution is enabled for the files stored under cgi-bin, because the idea is that these files are meant to be executed. They get executed when a GET or POST command specifies a query string and a script located under the cgi-bin directory.
So all these files must have execute permission, so that whenever they are specified, the programs can start executing. The directory structure therefore looks something like this. At the top we have the HTTP root, under which all the HTTP directories and folders are created. Under it there is the home directory and the CGI directory, and there can be other directories: some may hold documentation, some may hold library files, and so on. The web site that you are designing goes under the HTTP home, so the web folders shown here contain the directory tree of your web site. This is how the files and folders are organized.

In addition, there is something called a default web page, which is supported by all web servers. This is the page returned by the server if you do not specify a document explicitly. For example, consider a GET request in which you specify only the name of the site, say www.xyz.com, and not the name or path of any file. The request is sent to the web server, and the server searches its directory structure under the web root to find out whether a file with the default name is present. If it is found, that default file is returned to the requesting client. Usually the default file name is index.htm or index.html, which refer to an HTML document. It is possible to change the name of this default file, because the server has a configuration file which contains a number of options. You can simply open that file and edit it; one line in that file specifies the name of your default page.
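The default-page lookup just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name resolve_document is hypothetical, the document root /home/httpd/docs is the example directory from the lecture, and the sketch only computes the file name; it does not check whether the file actually exists on disk.

```python
import posixpath


def resolve_document(url_path, doc_root="/home/httpd/docs", default_page="index.html"):
    """Map the path part of a request to a file name under the document root.

    Hypothetical helper for illustration; a real server reads the document
    root and the default page name from its configuration file.
    """
    # No file named (empty path or a path ending in "/"): use the default page.
    if url_path == "" or url_path.endswith("/"):
        url_path += default_page
    # Normalise "." and ".." so the result cannot climb above the document root.
    clean = posixpath.normpath("/" + url_path.lstrip("/"))
    return doc_root + clean


# A request for the bare site name falls back to the default page.
print(resolve_document(""))
# The default page name is a configurable option.
print(resolve_document("/notes/", default_page="default.htm"))
```

Passing default_page as a parameter mirrors the point that the default file name is just one editable line in the server configuration.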
So instead of index.htm you can change it to, for example, default.htm, and that will be your default page in that case.

Now, talking about scripts: the first thing is that they reside on the server side. They are server-side scripts, and as I mentioned, a script is nothing but a program, which may be written in any language you want. It refers to a file that is executed by the server, and the output of the execution is sent back to the client. This is how scripts work. But how does the server know which script to run, how to get the input data, and where to send the output? There are two ways in which a server script may start executing: in response to a GET command with a question mark in the command line, or in response to a POST command. We shall see both in detail.

First let us look at the GET command with a question mark; one example is shown here. Recall that whatever follows the question mark represents a so-called query string, which consists of a number of name-value pairs separated by ampersands. For example, roll is a name and 1234 is its value; sex is a name and m is its value. You can put in any number of such name-value pairs, separated by ampersands. Whatever comes before the question mark does not refer to a web page; rather, it refers to an executable file or script. This is how the client specifies which program to run on the server side: we explicitly give the name of the script. Here, for instance, we are referring to a file called xyz.pl, which resides under the cgi-bin directory. This file needs to be executed and its output has to be sent back. Now, if you look at what happens internally, the server has to identify this question mark in the command line.
If it is a GET command and the server finds a question mark, it identifies that the file name specified before the question mark is actually a program to be executed. So it starts executing the xyz.pl program. The program is written in such a way that it can read in the name-value pairs that were supplied as parameters. How these values are read in we shall discuss later. The output generated by xyz.pl is sent back to the client, and as I said, typically the output is generated in the form of an HTML page, so that the browser can display it in a suitably formatted way on the screen.

Now, the POST command is in a way similar to GET, but the difference is that the name-value pairs to be taken as input by the program are not present on the command line. Rather, they are present as data following one or more header lines and a blank line. So there are a number of header lines, then a blank line, and after the blank line the name-value pairs appear. This is the format in which POST sends data to the web server. In POST, again, the file name that you specify is the executable program. One difference from GET is that with POST you are not limited in the number of name-value pairs or in the total size of the string you can send. Here again, the executable program that runs can read in the data values. We have explicitly named the program as part of the command, so the server knows which program to execute; but how the program gets the values of the name-value pairs when it runs, we shall see later, when we talk more about CGI scripts, how they work, and how they are actually written.
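To make the two formats concrete, here is a minimal Python sketch that extracts the script name and the name-value pairs from a GET request target and from a raw POST message. The function names are hypothetical, the path /cgi-bin/myscript.cgi and the header lines are assumed for illustration, and the sketch assumes the POST data arrives as a single URL-encoded string after the blank line. It only shows where the pairs sit in each request; how a CGI program actually receives them is a separate matter.

```python
from urllib.parse import parse_qs, urlsplit


def parse_get_target(target):
    """Split a GET target into (script path, name-value pairs)."""
    parts = urlsplit(target)
    # Everything before "?" names the script; everything after is the query string.
    return parts.path, parse_qs(parts.query)


def parse_post_message(raw):
    """Split a raw POST message into (script path, headers, name-value pairs)."""
    # The header section and the data are separated by one blank line.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    _method, target, _version = lines[0].split(" ")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return target, headers, parse_qs(body)


print(parse_get_target("/cgi-bin/xyz.pl?roll=1234&sex=m"))

post = (
    "POST /cgi-bin/myscript.cgi HTTP/1.0\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    "Content-Length: 15\r\n"
    "\r\n"
    "roll=1234&sex=m"
)
print(parse_post_message(post))
```

In both cases the same pairs come out (roll with value 1234, sex with value m); the only difference is whether they travelled on the command line or after the blank line.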
An example of the POST command is shown here. The first line is the actual POST command, which specifies the name of the CGI script, that is, the name of the executable; here the name is myscript.cgi. Then there are some header lines, followed by a blank line, and then the name-value pairs. These name-value pairs can appear on as many lines as you want; they need not appear on a single line, and you can break them up into a number of lines if you wish.

Now there are a few points you need to remember. The executable programs we are talking about are the so-called CGI scripts; CGI stands for Common Gateway Interface. These programs can potentially be written in any language of your choice. They can be written as a shell script, using the C shell, the Bourne shell, the Korn shell or any other shell. They can be written in Perl, which is a popular choice for many. They can be written using scripting technologies such as ASP or PHP, which are also quite popular. They may also be written in conventional programming languages like C or Java. But remember one thing: in whatever language you write the program, there must be support for executing it on the server side. If it is a C program, there has to be a C compiler to compile it first. If it is a Java program, there has to be a Java bytecode interpreter; if it is a Perl program, there has to be a Perl interpreter. So to run a script written in a particular language, support for that language must exist on the server. For example, a program written in ASP can be run under Internet Information Services (IIS), which is available under Windows, because IIS supports ASP by default. But you cannot run it directly under the Apache web server.
IIS can interpret ASP commands directly, which is why a program written in ASP has to be run through IIS.

Now let us very quickly look at what a proxy server is, why we use it, and what facilities it can provide. We had mentioned before that a proxy server acts as an intermediary between a client and a server. Typically, in the internet scenario, the client is a web browser such as Netscape or Internet Explorer, and the web server is some site from which you are trying to access a page. This web server is sometimes also called the origin server, that is, the origin of the document or resource you are trying to access. In this scenario you have access through the proxy server, which intercepts all your requests. Whenever you want to send a request to the outside world, your request is first sent to the proxy server, and the proxy server sends the request to the outside world on your behalf. This is how the proxy server works.

Let us see this in more detail. First, as I said, the proxy acts on behalf of clients and presents their requests to a server. The clients are on one side of the proxy server and the server is on the other side. Depending on whether it is receiving a request from a client or sending a request to an external server, the proxy sometimes acts as a web server and sometimes as a web client. How? Whenever the proxy server receives a request from a client, it acts as a web server to that client: the client sends its request as if to a web server, and the proxy accepts it as if it were one. But it does not process the request locally; rather, it forwards the request to some origin server which is outside the network.
While making this second part of the request, the proxy server acts as the client and the server located outside acts as the HTTP server. So a proxy server has a dual role: sometimes it is a client, sometimes it is a server. One of the most commonly used proxy servers is a program called Squid, which is freely available and can be installed on most platforms.

Diagrammatically, a proxy server looks like this. The user agents shown on this side belong to a private network, say N; these are three computers located inside the private network. The user agents may be nothing but simple browsers. Some commands have been typed into the browsers, and they refer to some origin server; maybe this is yahoo.com. There can be one origin server or several, but I am showing just one in this diagram for illustration. The user agents send their requests to the proxy. The proxy forwards each request to the origin server. The origin server, after processing the request, sends the response back to the proxy, and the proxy in turn sends the response back to the user agent that made the original request. This is how it works. In addition to this simple forwarding of requests and responses, a proxy server typically also contains some access rules and a cache, which we shall talk about very shortly. This is additional information maintained inside the proxy server.

So, talking about the functions of a proxy: the first one we have already mentioned, namely that it forwards requests. When it is forwarding requests, it also acts as a simple, rudimentary firewall. It can allow some requests and deny others. When someone from outside tries to access an internal node, your proxy can stop it.
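The forwarding role, together with the access rules and cache that are discussed next, can be sketched as a small Python class. This is a toy model under stated assumptions, not how Squid or any real proxy is implemented: the class name TinyProxy, the blocked_words rule list and the fetch_from_origin callback are all hypothetical, and the cache here never expires anything.

```python
class TinyProxy:
    """Toy proxy: applies access rules, then serves from cache or forwards."""

    def __init__(self, fetch_from_origin, blocked_words=()):
        # fetch_from_origin stands in for the proxy acting as a web client.
        self.fetch_from_origin = fetch_from_origin
        # Requests whose URL contains any of these strings are denied.
        self.blocked_words = tuple(blocked_words)
        # Maps a URL to the response previously received for it.
        self.cache = {}

    def handle(self, url):
        """Act as the web server towards the user agent."""
        if any(word in url for word in self.blocked_words):
            return "403 Forbidden"
        if url not in self.cache:
            # Not cached: forward the request to the origin server and keep a copy.
            self.cache[url] = self.fetch_from_origin(url)
        return self.cache[url]


def fake_origin(url):
    # Stand-in for a real origin server, used only for this demonstration.
    return "<html>page for " + url + "</html>"


proxy = TinyProxy(fake_origin, blocked_words=["blocked"])
print(proxy.handle("http://www.xyz.com/"))
print(proxy.handle("http://blocked.example/"))
```

A second request for the same URL is answered from the cache without calling the origin again, which is where the bandwidth saving comes from.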
So these kinds of capabilities can be programmed into the proxy. It can act as a simple firewall, and you can have access control in a proxy. Access control means you can allow or deny certain accesses based on contents or based on location. Based on contents means based on some strings that appear in the documents you are trying to fetch, or some string patterns in the name of the web site you are trying to access. Depending on the policy of the installation, you can set rules so that nobody is able to access certain kinds of sites or certain kinds of contents. Typically these rules are specified as a list of strings: you can say, allow access only to sites that contain these strings, or allow access only to sites that do not contain these strings. So you can allow or deny based on contents, either way. Similarly, you can control access based on location. You can specify the web sites that you do not want anybody to access from inside; these web sites are blocked. This kind of thing can be done very easily.

The third thing is that a proxy maintains an HTTP cache. What does the cache mean? HTTP requests come to the proxy; the proxy forwards them and gets back the requested information, say web pages, from the servers. These pages are forwarded to the requesting clients, but the proxy also keeps a copy of them on its local disk. There is a disk area called the cache where these copies are stored. The idea is that if some client requests a page which is already in the cache, the proxy need not send a request outside and get that page again; the page can be served directly from the local cache. This of course saves bandwidth and allows faster access.

Now let us talk about a very important device which is used in the internet.
This device is called a network address translator, or NAT. The reasons for using a NAT are manifold. In the simplest scenario, NAT allows a single device to act as an intermediate agent between the internet and a local network. This single device can be a router, or it can be a dedicated box; both are available commercially. The local network we refer to as the private network, and the internet, or external network, we refer to as the public network. The NAT sits between the public network and the private network.

You may argue that this sounds very similar to a proxy server, but there are some differences, which will become clear as we go into the details of how NAT works. NAT does not only regulate access; it also manages IP addresses. One big problem many organizations face today is this: suppose we have an organization with, say, 1000 computers, but we do not have 1000 valid IP addresses to assign to them. Rather, the internet service provider through which we have obtained our internet connection gives us only a few addresses. How can we manage with these few addresses? NAT is one solution. So NAT tries to address the IP address distribution problem. The way NAT works is specified in an RFC document; if you are interested you can have a look at it. The document number is RFC 1631.

If you are using NAT, then potentially one single unique IP address is sufficient to provide connectivity to an entire group of computers. Of course, several variations are possible. But in general, even if you have only a single registered IP address, that may be sufficient for your organization's requirement of access to the outside world.
But there are some restrictions, and we shall see later that not all kinds of access are allowed. Accesses from inside the network to outside may be allowed, but not the reverse: someone outside your network may not be able to connect directly to a computer inside. As I mentioned, NAT resides between the private network and the public network, which is the internet. NAT is available as a separate box from network vendors, and most routers which connect a private network to the public network also have the capability to act as a NAT. So NAT can be embedded inside a router, or you can have a separate box that acts as a NAT.

Now let us see the various forms of network address translation. The first is static NAT. As the name implies, here we provide a static address translation: a mapping of an unregistered IP address to a registered IP address. Remember that this is a one-to-one mapping, which means that if you have N unregistered IP addresses on N machines, then to provide static NAT you require N registered addresses. Suppose internally you have ten computers and you have ten registered IP addresses. NAT provides a correspondence between the internal computers and these IP addresses, so that with these addresses they can access the outside world. Static NAT is almost the same as giving these addresses directly to the computers, so it is not very useful except for some specific needs, namely when you want to assign a fixed IP address to some of the computers inside. For example, suppose that inside your network there is one computer which is a web server, and you want outside users, that is, people residing outside your network, to be able to access it. Then your web server must have a statically assigned IP address. But for the other computers you may not need this.
For the others you have alternatives. One alternative is to go for dynamic network address translation, or dynamic NAT. Dynamic NAT also maps an unregistered IP address to a registered IP address, but the difference from static NAT is that the mapping is not fixed. Rather, the registered IP address is taken from a given pool of registered IP addresses. What I mean is something like this. Suppose I am the NAT and I have ten registered IP addresses with me, and on the other side, in the private network, there are 100 computers. Whenever I get a request from one of the computers in the private network, I assign one of these ten addresses to that computer, and as long as the request is being processed, that address stays assigned to that computer. Once the processing is over, the address is deallocated and returned to my pool. With this scheme I can have ten simultaneous connections at a time, but I am not limiting the total number of computers that can have connections with the outside world. There may be 100 or even 1000 computers and all of them can have access, but not more than ten at a time, because I have only ten valid addresses and I am allocating them on demand. So the addresses are assigned dynamically, and you can have any number of computers, but with N addresses only N of them can communicate at any given time.

There is another form of NAT, which is perhaps the most popular. This is called overloading. In a sense it is a special form of dynamic NAT, because addresses are not assigned statically to the computers; the mappings are again generated dynamically. It is used to map multiple unregistered IP addresses to a single registered IP address.
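Before going further into overloading, the pool-based allocation of dynamic NAT described above can be sketched in Python. This is an illustrative model only: the class name DynamicNat and its method names are hypothetical, and the addresses used are reserved documentation and private addresses chosen for the example.

```python
class DynamicNat:
    """Toy dynamic NAT: lends registered addresses from a pool on demand."""

    def __init__(self, registered_addresses):
        # Registered addresses that are currently not lent out.
        self.free = list(registered_addresses)
        # Translation table: private address -> registered address in use.
        self.att = {}

    def open_connection(self, private_ip):
        """Return the registered address for this host, or None if the pool is empty."""
        if private_ip in self.att:
            return self.att[private_ip]
        if not self.free:
            return None  # all N addresses are busy, so an (N+1)th host must wait
        self.att[private_ip] = self.free.pop(0)
        return self.att[private_ip]

    def close_connection(self, private_ip):
        """Return the host's registered address to the pool."""
        public_ip = self.att.pop(private_ip, None)
        if public_ip is not None:
            self.free.append(public_ip)


nat = DynamicNat(["203.0.113.1", "203.0.113.2"])
print(nat.open_connection("10.0.0.1"))  # gets an address from the pool
print(nat.open_connection("10.0.0.2"))  # gets the other address
print(nat.open_connection("10.0.0.3"))  # pool exhausted
nat.close_connection("10.0.0.1")
print(nat.open_connection("10.0.0.3"))  # succeeds now that an address is free
```

With a pool of two addresses, the third host is refused until one of the first two releases its address, which is exactly the limit of N simultaneous connections for N registered addresses.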
Here there is a difference: you have a single registered address, not multiple addresses as in the previous case. There is a single valid IP address available with me, but I can still support 100 simultaneous requests from the private network. How can I do this? By using port numbers: the different requests use different port numbers. Because we are using ports to distinguish the requesting computers, this method is also called port address translation, or PAT in short. Effectively, in this scheme each computer in the private network gets translated to the same IP address; the only difference is that each has a different port number. With this kind of scheme you are no longer limited by the number of IP addresses you have, and you can have many simultaneous connections. As I said, this scheme is widely used.

So let us look at NAT overloading in some detail. As I said, using multiple port numbers you can support several simultaneous connections. This is what is called multiplexing at the transport layer of the TCP/IP protocol stack. Multiplexing means that a computer can maintain several concurrent connections, even with a single remote computer, by using different port numbers. For example, I can have a connection with a machine X, but effectively I can have 100 connections with X if I use 100 different port numbers. Each of these 100 connections may refer to a different server program, or they may be multiple requests going to the same server program; I can use different port numbers to distinguish them. Now recall that the packet headers, that is, the IP header together with the transport-layer header that follows it, contain the source and destination IP addresses and the source and destination port numbers. The combination of these four elements defines a complete connection. So two connections may even have the same source IP address and the same destination IP address.
If we can vary, say, the source port number, then the four-tuple will still be unique. The other three elements may have the same values, but at least one of the four values must differ between any two connections. This is how we can ensure that multiple connections can be established.

Now, some notation. We define something called the stub domain, which means the private network: the domain behind the NAT, which is part of the private organization network, is your stub domain. For address translation, the NAT must maintain a table called the address translation table, or ATT in short. The ATT has to be maintained by the router or network address translator while it is carrying out the address mapping and also the port mapping.

Let us look at the issues one by one. Suppose we want to implement simple dynamic NAT, with no overloading for the time being. Simple dynamic NAT means I have a pool of IP addresses; if there are N addresses, I can support N simultaneous connections. For implementing dynamic NAT, what should the ATT contain? It need only contain the IP address of the source and the registered IP address which has been allocated to it. Suppose some computer, with the address 10.5.6.7 say, sends a request to the NAT. The NAT assigns a valid address to it and makes an entry in the table recording that 10.5.6.7 has been assigned this address. So for every request that comes in, an entry is maintained in this table. What does this table allow? It allows outgoing connections, of course, but it also allows incoming connections. Why? As long as an entry like this remains in the table, you can have a connection coming in from the outside world. If you see that a request is destined for the machine 10.5.6.7, then you can forward that request to that machine.
You can do this because you know from the table that this particular address is in use. But there is a problem if the request arrives addressed to a 10.x address, which you may recall is a private address: the external routers may well discard such packets, so the packet may not reach the NAT at all. So instead of advertising this private address to the outside world, you tell the outside world: my NAT has assigned me this valid IP address, so please connect to me through it. Maybe the valid IP address is 203.10.5.17 or something like that. The outside agent uses that IP address, and the ATT provides the translation, so that the request is forwarded to the internal host.

With dynamic address mapping you can also have some of the entries in the table fixed statically. For web servers, as in the example I gave, you need a permanent address which is known to everybody. The other addresses can be dynamically assigned, and remain assigned only for the duration of a connection. In this way you can have the best of both worlds: some addresses which can be accessed from outside, and, with the other IP addresses available, dynamic access for the internal nodes. But you cannot have it the other way around: a host from outside cannot directly access an internal node which does not have a statically assigned IP address stored in the table.

In this scheme port numbers are not needed. But when you talk about NAT overloading, you require port numbers. The scenario is like this. The internal hosts, as I said, have non-routable or private IP addresses. If such an address appears as the destination address in a packet, a router on the internet cannot forward the packet to it; it will simply ignore it.
The NAT-enabled router or NAT box, on the other hand, holds a registered IP address, allocated under the authority responsible for internet address assignment, the IANA. This one single IP address will be sufficient.

Now suppose the scenario is like this. An internal host, say X, tries to connect to an outside web server. The request first reaches the router. The router receives this request packet from X and saves the IP address of the requesting computer (this is a private IP address, please recall) and the port number which the computer was using to send its request. Both can be obtained from the packet received from X, and the router saves these two values in the address translation table. Then, in the packet, it replaces the source IP address, which was the private address, with the router's own IP address. It also replaces the source port number: with the port number already recorded in the table if an entry for this connection exists, or with a newly generated unique port number if this is a new connection. So you are basically modifying the packet by changing its source IP address and its source port number. The destination, which is some address in the outside world, is left as it is; the source has to be changed because the response has to come back. If the source address were still the private address, the response could not come back. So the NAT replaces the source address with the valid IP address that the router holds, and the source port with a unique number that it generates automatically.

On the other hand, when the response packet comes back, the destination port number of the packet is used to search the ATT. That destination port number, as I mentioned, was assigned automatically and uniquely, so it uniquely identifies an entry in the table; you can say that it acts as the primary key of the ATT.
So from the table you obtain the original private address and port number of the internal host. You accordingly change the destination address and port of the incoming packet, and finally forward the packet to the host in the internal network. So the packet gets forwarded to the computer from which the request originated.

A typical address translation table may look like this. The first column is not really necessary, but for documentation purposes it is sometimes kept: the source computer names are stored here. When a request comes, the source computer's address is stored, and also the port number which the source computer was using. The NAT IP address is a single address, so this column is the same in every row, and the NAT generates a unique port number for every outgoing connection. Whenever an incoming packet arrives, it carries this particular NAT address as the destination address, but the port number is the distinguishing factor. So when a packet comes in to port number 2, the NAT, through this table lookup, changes the destination address to the corresponding internal address and port number 2 to port number 75, and forwards the packet to the internal network. This is how the NAT, using this table, can provide address translation dynamically.

But remember one thing. Although overloading appears to be a very powerful technique, its one problem is that it does not give you a permanent address through which an outside host can access one of your internal servers. Suppose you have a web server and you want a public IP address allotted to it. Through overloading alone you cannot do that, because all requests are processed dynamically by assigning port numbers. If you want this kind of facility as well, you have to mix static NAT with overloading, so that some entries in the table are permanently fixed and other entries are assigned port numbers and processed dynamically.
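The outgoing and incoming translation steps of overloading can be sketched in Python as follows. This is a simplified model under stated assumptions: the class name OverloadingNat is hypothetical, packets are reduced to a four-tuple of source address, source port, destination address and destination port, the addresses are reserved example addresses, and NAT ports are simply handed out in increasing order without handling exhaustion or entry time-outs.

```python
import itertools


class OverloadingNat:
    """Toy NAT overloading (PAT) keyed on a NAT-generated port number."""

    def __init__(self, nat_ip, first_port=1):
        self.nat_ip = nat_ip
        self.next_port = itertools.count(first_port)
        # (private address, private port) -> NAT port, for outgoing packets.
        self.by_source = {}
        # NAT port -> (private address, private port), for incoming packets.
        self.by_nat_port = {}

    def outgoing(self, src_ip, src_port, dst_ip, dst_port):
        """Rewrite the source address and port; the destination is untouched."""
        key = (src_ip, src_port)
        if key not in self.by_source:
            nat_port = next(self.next_port)  # new connection: new unique port
            self.by_source[key] = nat_port
            self.by_nat_port[nat_port] = key
        return (self.nat_ip, self.by_source[key], dst_ip, dst_port)

    def incoming(self, src_ip, src_port, dst_ip, dst_port):
        """Look up the destination port and restore the private address and port."""
        entry = self.by_nat_port.get(dst_port)
        if dst_ip != self.nat_ip or entry is None:
            return None  # no table entry, so the packet is dropped
        return (src_ip, src_port, entry[0], entry[1])


nat = OverloadingNat("203.0.113.9")
print(nat.outgoing("10.5.6.7", 75, "198.51.100.20", 80))
print(nat.outgoing("10.5.6.8", 75, "198.51.100.20", 80))
# The reply to NAT port 2 is sent on to 10.5.6.8, port 75.
print(nat.incoming("198.51.100.20", 80, "203.0.113.9", 2))
```

The incoming lookup returning None for an unknown port is also why an outside host cannot open a connection to an internal server unless a permanent entry is added to the table.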
Talking about the maximum number of concurrent translations that a NAT will provide, this is primarily determined by the memory size, the size of the memory that we use to store this table. Just to show a simple calculation, a typical entry in the ATT may take around 160 bits, which means for every request you will be needing about 20 bytes. So suppose you have 8 megabytes of memory available with you. If you make the calculation, the total number of bits divided by 160, this will come to more than 4 lakh concurrent translations. Which means you can potentially have so many entries in the table, and memory will not really be the constraint for you. But the port number is a 16 bit number. You can have only up to about 65,000 distinct port numbers. So the number of distinct port numbers you can have will ultimately provide the upper limit. And talking about which addresses to use in the private network, there are some private address classes which I had also mentioned before. Let us very quickly brush through them. These have been set aside by the address assignment agency as non-routable. This means these are unregistered; routers will discard packets carrying these addresses if they are used as the destination. Which means that a packet from a host with a private address can reach a registered host, but not the reverse, because if you are trying to have the reverse connection, then the address of that private network has to be there in the destination part, which the routers will ignore. So from the private network you can reach an outside host but not the reverse. In order to achieve this you have to have a proxy server or a NAT sitting in between. The private address classes, just to brush up: there is one class A address, 10; there are 16 class B addresses, 172.16 through 172.31; and there are 256 class C addresses, 192.168.0 through 192.168.255. So many addresses are available to be used as private addresses, depending on the size of your organization and its requirements.
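The calculation and the private address ranges just mentioned can be checked with a short Python sketch. The function names max_entries and is_private are hypothetical, the figure of 160 bits per entry is the one quoted in the lecture, and 8 megabytes is taken here as 8 x 1024 x 1024 bytes, which is an assumption. The three private blocks are written in prefix notation, where 172.16.0.0/12 covers 172.16 through 172.31.

```python
import ipaddress

ENTRY_BITS = 160  # size of one table entry, as quoted in the lecture


def max_entries(memory_bytes, entry_bits=ENTRY_BITS):
    """Number of whole table entries that fit in the given memory."""
    return (memory_bytes * 8) // entry_bits


# One class A block, 16 class B blocks, 256 class C blocks.
PRIVATE_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_private(address):
    """True if the address falls inside one of the three private blocks."""
    ip = ipaddress.ip_address(address)
    return any(ip in block for block in PRIVATE_BLOCKS)


if __name__ == "__main__":
    print(max_entries(8 * 1024 * 1024))  # entries that fit in 8 megabytes
    print(2 ** 16)                       # distinct 16 bit port numbers
    for host in ("10.1.2.3", "172.31.0.1", "172.32.0.1", "192.168.0.10"):
        print(host, is_private(host))
```

Comparing the two printed numbers shows why the port numbers, and not the memory, set the limit for a single public address.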
You can possibly select one of these and use a proxy server or a NAT in conjunction with it to provide access to the outside world. Now, other benefits of NAT. Well, NAT automatically creates some kind of a firewall between the internal and external networks. Why? Because NAT only allows connections that have originated from inside. An outside host, just as I mentioned, cannot directly establish a connection with an internal host, because the internal host is having a private address and the outside host cannot directly specify that particular private address to initiate a connection. So for cases where you need inbound mapping, as I mentioned, for a web server for example, you need static address assignment in those special cases. So in general you can have a combination of static and dynamic address mapping to have this flexibility: some addresses statically assigned, some addresses dynamic. A question which sometimes arises is whether NAT and a proxy server are the same. Well, technically speaking, although they work in a similar way, they are not the same. Why? The main reason is that NAT is transparent to both source and destination hosts. Neither the source nor the destination needs to know that a NAT is present. But this is not so in case of a proxy server. For example, if you are accessing the internet through a proxy server, you have to explicitly mention the name of the proxy server and the port number it is using in the configuration of your browser. That is something which is not transparent; you must explicitly specify that this is the proxy server I want to use. And the other difference is that NAT works at the network layer, and a proxy server works at the transport layer or above. So with this we come to the end of today's lecture. We shall now be looking at the solutions to the problems that were posed in the last lecture, and we shall also be presenting some quiz questions for today.
So let us first look at the solutions to the quiz questions on our previous lecture. The first question was: why is the traditional http protocol called stateless? This we had mentioned: it is stateless because the http server immediately closes the connection after completing a transaction, and no history is maintained. But again, you recall we mentioned that you can have an option, either close or keep alive, where you can specify whether you want this kind of stateless transaction, with the connection closed by default, or you want to have a persistent connection, where the connection will remain active over successive transactions or requests. What is a hypertext? Hypertext is a text or document which, in addition to the text, also contains links to other texts. These links to other texts are called hyperlinks. Now in the context of the World Wide Web, the pages that we see on the browser are mostly hypertext. They are typically written in html. There are other technologies also, but html is the most popular. What is the default port number of http? It is port number 80. What does the client request to an http server comprise of? It consists of several things: the request method, the path portion of the URL and the version number. They must appear on the same line. Depending on the request method you may have optional request headers. After the header lines there will be a blank line, followed by some additional data if you want. For the POST and PUT methods you need this additional data. So all this taken together are the components of a client request. How can the GET command be used to submit forms? Well, this again we have mentioned today: by including a question mark after the path name and including a query string after that question mark. What is the purpose of the HEAD command?
To return the header information of the specified document, because in many cases the client does not need to have the full document, only the header information. This may be used in a number of application cases. There is one application I can cite, for instance: in case of web search engines, when they try to maintain and update their database, they simply look at the header rather than the complete document. So the header should contain some information which may be useful for the search engine to update its database. In what way is POST different from GET when data is being sent to a common gateway interface script? Well, if you recall, in POST we do not send the data as part of the initial request line. Rather, there will be the header lines, then there will be a blank line as the delimiter, and following the blank line we specify the query string. Now in case of POST the size of the query string can be larger than in GET. For GET, it is limited to the maximum size of this string that the machine where the server is running supports. Typically it is 256 characters. How are the data sent in the POST command? Well, after the blank line following the header lines. What does the connection field in the http request header signify? Well, it specifies whether you want to close the connection immediately after the transaction, the so called stateless kind of communication, or whether you want to retain the connection across several transactions. What does a typical http response consist of? A response is similar to a request. The only difference is that the initial request line is not there, but in place of it there is an initial response line. Following the response line there will be some header lines, then a blank line, followed by the requested data. So the remaining part is quite similar to a request; only the initial response line is different. What are the basic differences in the http 1.1 version from 1.0?
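Before taking up that answer, the layout of GET and POST requests and of a response, as described in the answers above, can be sketched in Python. This is only an illustration under assumptions: the function names, the path /cgi-bin/form.cgi, the host www.example.com and the form fields are all hypothetical, and the Host, Content-Type and Content-Length header lines are included because a typical form submission carries them.

```python
from urllib.parse import urlencode

CRLF = "\r\n"


def build_get(path, fields, host):
    """GET: the query string follows a question mark on the request line."""
    query = urlencode(fields)
    return f"GET {path}?{query} HTTP/1.1{CRLF}Host: {host}{CRLF}{CRLF}"


def build_post(path, fields, host):
    """POST: the query string is sent after the blank line, not on the request line."""
    body = urlencode(fields)
    lines = [
        f"POST {path} HTTP/1.1",
        f"Host: {host}",
        "Content-Type: application/x-www-form-urlencoded",
        f"Content-Length: {len(body)}",
    ]
    return CRLF.join(lines) + CRLF + CRLF + body


def split_response(raw):
    """Split a response into its response line, header lines and data."""
    head, _, body = raw.partition(CRLF + CRLF)
    status_line, *header_lines = head.split(CRLF)
    headers = dict(line.split(": ", 1) for line in header_lines)
    return status_line, headers, body


if __name__ == "__main__":
    form = {"name": "ravi", "roll": "12"}
    print(build_get("/cgi-bin/form.cgi", form, "www.example.com"))
    print(build_post("/cgi-bin/form.cgi", form, "www.example.com"))
    print(split_response(
        "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"))
```

The same query string appears in both requests; only its position changes, which is why POST is not bound by the limit on the length of the request line.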
Well, the differences are host identification, default support for persistent or keep alive connections, content negotiation, requesting a part of a document, and caching. These are the additional features which version 1.1 supports. How does a proxy server act both as a client and a server? Well, it is a server when it is receiving a request from an internal client, and it is a client when it is forwarding the request to an external origin server. What is the URL syntax for FTP? For ftp it will be starting with ftp://, then it will be specifying the user name and password separated by a colon. Then there is the at the rate symbol, followed by the host name and the path name of the document you are going to transfer.
Now the questions from today's lecture. Well, which http commands can result in the execution of server side scripts? What are the differences between root directory and home directory in a web server installation? What are the main purposes behind the use of proxy servers? Name two web servers and one proxy server that are widely used. In static NAT, on what factors will the number of registered IP addresses depend? For NAT overloading, what are the typical entries in the fields of the address translation table? Which of the fields would not be required if it is only required to implement dynamic NAT? Can a machine with a private IP address communicate with a public host in the outside world? So these are the questions from today's lecture. In our next lecture we shall be starting on our next module, on web page design. We shall be starting with a discussion on the hypertext markup language, html, which is so popularly used for designing web pages. That we shall see in our next class. Thank you.
Today we shall start our discussion on html, which is the de facto language for designing web pages. Although today we have several other alternatives available with us, html still remains one of the most popular choices when it comes to the design of web pages.
So today we shall be starting with the basic structure of an html document. What are the different things, the different so called tags and attributes, that a typical html file contains? And we shall see in our subsequent lectures what other features you can support as part of html. So the first thing is the full form of html, which is Hyper Text Markup Language. There are two components to this name: one is Hyper Text, the other is Markup. Well, Hyper Text we had already talked about earlier. Hypertext is a kind of textual document where you can have links to other documents. In html these kinds of links are allowed. So in that sense an html document is a hypertext document.