Monday, November 29

404 and HTTP: What Do All the Codes Mean?


A question that comes up often, especially with regard to site stats, is "What do all the codes mean?"

Web standards are governed by documents prepared by standards committees, approved, and then implemented world-wide. The following notes are extracted from RFC 2068, the governing document for the Hypertext Transfer Protocol (HTTP/1.1).

Message Number Categories


Because you usually see only one or two error numbers, it's easy to believe that's all there are. Actually, there are families of numbers, and not all of them are errors. We'll explore each family and its members below.
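To make the idea of families concrete, here is a minimal Python sketch that sorts a code into its family by its first digit. The function names `family` and `describe` are made up for this illustration; `HTTPStatus` comes from Python's standard library and knows more codes than the ones covered in these notes.

```python
from http import HTTPStatus

# Family names keyed by the first digit of the status code.
FAMILIES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def family(code: int) -> str:
    """Return the family name for a three-digit HTTP status code."""
    return FAMILIES.get(code // 100, "Unknown")


def describe(code: int) -> str:
    """Combine the code, its standard phrase (if Python knows it), and its family."""
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        phrase = "unrecognized code"
    return f"{code} ({phrase}): {family(code)}"


if __name__ == "__main__":
    for code in (200, 304, 404, 503):
        print(describe(code))
```

Only the first digit decides the family, which is why a browser can handle an unfamiliar code sensibly by treating it like the x00 code of the same family.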

1xx Codes (Informational). There are a few official codes in the one hundred range, but if you see one you have probably stumbled onto some sort of experimental application. In that case, what you see will be non-standard and could be almost anything.
 
100 (Continue). An interim response telling the browser that the initial part of its request has been received and not rejected by the server. The server sends a final response code once the remainder of the request has been received and acted on.


101 (Switching Protocols). The browser may wish to change the protocol it's using. If such a request is sent and approved by the server, this response is given.


2xx Codes (Success). The two hundred range is reserved for successful responses. You probably won't see one of these codes, but your browser will receive them and know that whatever request was sent by the browser was received, understood, and accepted.

 
200 (OK). The request was successful and information was returned. This is, by far, the most common code returned on the web.

201 (Created). If a POST command is issued by a browser (usually in processing a form), the 201 code is returned if the resource requested to be created was actually created. If the resource cannot be created right away, the response should be 202 instead.

202 (Accepted). If a request for processing was sent and accepted but not acted upon, and the delay in acting is unknown, then this code should be sent instead of 201. Note that 202 does not commit to processing the request; it only says the request was accepted. A pointer to some status monitor for the task is often included with this response so users can check back later.


203 (Non-Authoritative Information). Usually the header information sent from a server to a browser comes directly from the original server. If it does not, this code may be sent to indicate that the header information was gathered from a local or third-party copy rather than from the original server.


204 (No Content). The request was accepted and filled but no new information is being sent back. The browser receiving this response should not change its screen display (although new or changed header information may be sent).

205 (Reset Content). When you fill in a form and send the data, the server may send this code telling the browser that the data was received and the action carried out so the browser should now clear the form (or reset the display in some manner).

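206 (Partial Content). This code indicates the server has filled a request for only part of a resource. The browser asks for the part it wants using a "Range" field in the request, for example to resume an interrupted download.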
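As a rough illustration of how a 206 response comes about, the Python sketch below handles a single byte range such as "bytes=2-5". The helper names `parse_single_range` and `respond` are invented for this example, and falling back to a plain 200 when the range is unusable is a simplifying assumption, not a statement of what any particular server does.

```python
def parse_single_range(header: str, size: int):
    """Return inclusive (start, end) byte offsets, or None if the header is unusable.

    Handles only the single-range forms 'bytes=A-B', 'bytes=A-' and 'bytes=-N'.
    """
    unit, _, spec = header.partition("=")
    if unit.strip() != "bytes" or "," in spec or "-" not in spec:
        return None
    first, _, last = spec.strip().partition("-")
    try:
        if first == "":
            # Suffix form: the last N bytes of the resource.
            n = int(last)
            if n <= 0:
                return None
            start, end = max(size - n, 0), size - 1
        else:
            start = int(first)
            end = int(last) if last else size - 1
    except ValueError:
        return None
    end = min(end, size - 1)  # never read past the end of the resource
    if start > end:
        return None
    return start, end


def respond(body: bytes, range_header=None):
    """Return (status, payload): 206 with a slice for a usable range, else 200 with everything."""
    if range_header:
        rng = parse_single_range(range_header, len(body))
        if rng:
            start, end = rng
            return 206, body[start:end + 1]
    return 200, body
```

For a ten-byte resource, `respond(body, "bytes=2-5")` yields status 206 with four bytes, while a request with no Range field gets the whole body with status 200.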
 
3xx (Redirection). The 3xx codes indicate some need for further action by your browser. User action may or may not be necessary to cause this further action to take place; often it will just happen automatically. There are safeguards built into the specification designed to prevent infinite loops, which can sometimes result from automatic redirection.


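300 (Multiple Choices). The requested resource is available in more than one form (for example, in several languages or formats), each at its own location. The response should list the choices so that you, or your browser, can pick one.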

301 (Moved Permanently). As the name implies, the addressed resource has moved and all future requests for that resource should be made to a new URL. Sometimes there is an automatic transfer to the new location.

302 (Moved Temporarily). The addressed resource has moved, but future requests should continue to come to the original URL. Sometimes there is an automatic transfer to the new location.

303 (See Other). The response to your browser's request can be found elsewhere. Automatic redirection may take place to the new location.

304 (Not Modified). In order to save bandwidth your browser may make a conditional request for resources. The conditional request contains an "If-Modified-Since" field and if the resource has not changed since that date the server will simply return the 304 code and the browser will use its cached copy of the resource.
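The server's side of that conditional request can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `conditional_status` is made up, and ignoring a date that cannot be parsed is a choice made here for simplicity.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_status(last_modified: datetime, if_modified_since=None) -> int:
    """Return 304 if the browser's cached copy is still current, else 200.

    `last_modified` must be a timezone-aware datetime.
    """
    if if_modified_since is None:
        return 200  # an ordinary, unconditional request
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable date: treat the request as unconditional
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds first.
    if last_modified.replace(microsecond=0) <= since:
        return 304
    return 200


if __name__ == "__main__":
    modified = datetime(2020, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    header = format_datetime(modified, usegmt=True)  # the date as it appears on the wire
    print(header, "->", conditional_status(modified, header))
```

When the answer is 304 the server sends no body at all, which is where the bandwidth saving comes from.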

305 (Use Proxy). This is notice that a specific proxy server must be used to access the resource. The URL of the proxy should be provided.
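The loop safeguard mentioned for the 3xx family can be pictured with a small Python sketch. Instead of making real network requests it walks a dictionary standing in for a web site; the names `follow`, `site`, and `max_hops`, and the limit of five hops, are assumptions made for this example.

```python
REDIRECT_CODES = {301, 302, 303}


def follow(url, site, max_hops=5):
    """Follow redirects through `site`, a dict mapping URL -> (status, location).

    Returns (final_url, status). Raises RuntimeError on a loop or too many hops.
    """
    seen = {url}
    for _ in range(max_hops + 1):
        status, location = site.get(url, (404, None))
        if status not in REDIRECT_CODES:
            return url, status  # a final answer, whatever it is
        if location in seen:
            raise RuntimeError(f"redirect loop at {location}")
        seen.add(location)
        url = location
    raise RuntimeError("too many redirects")


if __name__ == "__main__":
    site = {
        "/old.html": (301, "/new.html"),
        "/new.html": (200, None),
    }
    print(follow("/old.html", site))
```

Remembering the URLs already visited catches simple loops immediately, and the hop limit catches long chains that never revisit the same address.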
 
4xx (Client Error). The 4xx codes are the ones you are most likely to actually see, particularly code 404. These codes indicate that something appears to be wrong with the request itself.

400 (Bad Request). The server did not understand the request, usually because it was malformed. Resending the same request unchanged will not help; the request has to be corrected first.

401 (Unauthorized). The request requires some form of authentication (e.g., userid and/or password) but did not contain it. Usually, this code results in a box popping up in your browser asking you for the required information. Once you supply it, the request is sent again.

402 (Payment Required). Reserved for future use. [Who says the web is not moving toward being a commercial medium!]

403 (Forbidden). This is a sort of catch-all refusal. If the server understood the request but, for whatever reason, refuses to fill it, a code 403 will often be returned. The server may or may not explain why it is sending a 403 response and there is not much you can do about it.




404 (Not Found). If you happen to mistype a URL or enter an old one that no longer exists, this is the error you will likely see. The condition may be temporary or permanent but this information is rarely provided. Sometimes code 404 is sent in place of 403, when the server does not want to reveal why a request was refused.

405 (Method Not Allowed). Your browser has requested a resource using a method (such as GET or POST) that is not allowed for that resource. The response should list the methods that are allowed.


406 (Not Acceptable). Your browser said only certain response types will be accepted and the server says the content requested does not fit those response types. (This is one way content monitoring can be implemented.)


407 (Proxy Authentication Required). This code is similar to 401, except that the browser must first authenticate itself with the proxy server.


408 (Request Timeout). Your browser took too long to send its request and the server gave up waiting. The browser may send the request again.


409 (Conflict). If a site allows users to change resources and two users attempt to change the same resource, there is a conflict. In this, and other such situations, the server may return the 409 code and should also return information necessary to help the user (or browser) resolve the conflict.

410 (Gone). Code 410 is more specific than 404 when a resource can't be found. If the server knows, for a fact, that the resource is no longer available and no forwarding address is known, then 410 should be returned. If the server does not have specific information about the resource, then 404 is returned.

411 (Length Required). For some processes a server needs to know exactly how long the content is. If the browser does not supply the length (in a "Content-Length" field), code 411 may result.

412 (Precondition Failed). A browser can put conditions on a request. If the server evaluates those conditions and comes up with a false answer, the 412 code may be returned.


413 (Request Entity Too Large). If your browser makes a request that is longer than the server can process, code 413 may be returned. Additionally, the server may even close the connection to prevent the request from being resubmitted (this does not mean a phone connection will hang up; just that the browser's link to the site may be terminated and have to be started over again).

414 (Request-URI Too Long). You will likely not see this one as it is rare. But, if the resource address your browser sends to the server is too long, this code will result. One of the reasons this code exists is to give the server a response when it is under attack by someone trying to exploit fixed-length buffers by causing them to overflow.


415 (Unsupported Media Type). If your browser sends data with its request in a format the server does not support for that resource, this code may result.
 
5xx (Server Error). The 5xx series of codes indicate cases where the server knows it has made an error or is
not capable of answering the request. In most cases the server should include some information explaining the
error and say if the situation is temporary or permanent.

500 (Internal Server Error). An unexpected condition prevented the server from filling the request.

501 (Not Implemented). The server is not designed (or does not have the software) to fill the request.

502 (Bad Gateway). When a server acts as a go-between (a gateway or proxy), it may receive an invalid response from the server it contacted on your behalf. This code is returned when that happens.

503 (Service Unavailable). This code is returned when the server cannot respond due to temporary overloading or maintenance. Some users, for example, have limited accounts which can only handle so many requests per day or bytes sent per period of time. When the limits are exceeded a 503 code may be returned.

504 (Gateway Timeout). A gateway or proxy server timed out without responding.

505 (HTTP Version Not Supported). The browser has requested a specific transfer protocol version that is
not supported by the server. The server should return what protocols are supported.

What Can Webmasters Do?

Users get frustrated by error messages that don't really tell them anything. Even the descriptions above
for the various return codes don't say what you, the user, can do.

Webmasters can help. By analyzing their logs a webmaster can determine which error codes are being
returned to users. For the most common, more descriptive error messages can be generated and the
system told to use them. This is done using a file named ".htaccess" placed in the main directory for the web site. [.htaccess is used by the Apache web server, which is common on Web hosts using UNIX or some UNIX offshoot.]
The .htaccess file can control many things, but to help with error messages the webmaster has only to insert
line(s) of the form (each of these should be on a line by itself starting with "ErrorDocument" but they may be
wrapped in this display):


ErrorDocument 402 "
 
ErrorDocument 403 /forbidden.html

ErrorDocument 404 http://cknow.com/notfound.html

Note that the ErrorDocument command can have raw HTML code (note the leading quote only; no ending quote), file references, or URL references. Use whichever is appropriate to help users when they encounter errors at your site. If nothing else, include a 404 ErrorDocument command to help those who mistype something. If you don't they may not come back!
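To decide which codes deserve a custom ErrorDocument, the log analysis mentioned above can be sketched in Python. This is a minimal sketch that assumes the server writes its access log in the common log format, where the status code follows the quoted request line; the function names and the sample lines are invented for illustration.

```python
import re
from collections import Counter

# In the common log format the three-digit status follows the quoted request.
STATUS_RE = re.compile(r'" (\d{3}) ')


def count_statuses(lines):
    """Count how many times each status code appears in the log lines."""
    counts = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


def error_report(counts):
    """Return (code, hits) pairs for 4xx and 5xx codes, most frequent first."""
    errors = [(code, hits) for code, hits in counts.items() if code >= 400]
    return sorted(errors, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    # Hypothetical log lines; a real script would read the server's access log file.
    sample = [
        '10.0.0.1 - - [29/Nov/1999:10:00:00 +0000] "GET /index.html HTTP/1.0" 200 2326',
        '10.0.0.2 - - [29/Nov/1999:10:00:05 +0000] "GET /old.html HTTP/1.0" 404 209',
        '10.0.0.3 - - [29/Nov/1999:10:00:09 +0000] "GET /private/ HTTP/1.0" 403 210',
        '10.0.0.2 - - [29/Nov/1999:10:00:12 +0000] "GET /oldpage.html HTTP/1.0" 404 209',
    ]
    for code, hits in error_report(count_statuses(sample)):
        print(code, hits)
```

The codes at the top of the report are the ones worth writing friendlier error pages for first.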

If you want to really help (and keep the search engines happy), when you change your Web site layout consider adding "redirect" lines to the .htaccess file. These cause requests for specific files that have been moved to be automatically directed to their new location, and they tell the search engines that the URL has changed. There are two forms you can use:
