D. J. Bernstein
Internet publication
FTP: File Transfer Protocol

How a browser retrieves a file

Here is how a typical FTP connection works.

A browser wants to retrieve an FTP URL such as

     ftp://ftp.heaven.af.mil/pub/report
This URL refers to the binary file or directory with pathname /pub/report, accessed under username anonymous, on the FTP server at the host ftp.heaven.af.mil.
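
The pieces named above can be pulled out of the URL mechanically. Here is a minimal sketch using Python's standard urllib.parse. The function name split_ftp_url is hypothetical, and the defaults (username anonymous, port 21) are simply the values used in this walkthrough. Percent-decoding of the pathname is not attempted.

```python
from urllib.parse import urlsplit


def split_ftp_url(url):
    """Return (host, port, username, pathname) for an ftp:// URL.

    Hypothetical helper: defaults follow the walkthrough, namely
    username "anonymous" and TCP port 21 when the URL names neither.
    """
    parts = urlsplit(url)
    if parts.scheme != "ftp":
        raise ValueError("not an FTP URL: %r" % url)
    host = parts.hostname
    port = parts.port or 21
    username = parts.username or "anonymous"
    pathname = parts.path
    return host, port, username, pathname


if __name__ == "__main__":
    print(split_ftp_url("ftp://ftp.heaven.af.mil/pub/report"))
```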

The browser looks up the IP address of ftp.heaven.af.mil, which happens to be 10.1.2.3. The browser connects to TCP port 21 on 10.1.2.3 and waits for the server's greeting:

     220 ftp.heaven.af.mil ready.
If the server does not accept the connection, the browser quits.
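
Every later step depends on reading a response and looking at its code. Here is a rough sketch of a response reader, assuming the control connection has been wrapped in a binary file-like object (for example with socket.makefile("rb")). The names read_response and classify are hypothetical. The terms mark and acceptance are the ones used in this walkthrough; the labels for codes beginning with 3, 4, and 5 are assumptions based on the usual first-digit meanings.

```python
import io
import re


def read_response(stream):
    """Read one FTP response; return (code, lines).

    A response whose first line has "-" after the code continues
    until a line that starts with the same code followed by a space.
    """
    raw = stream.readline()
    if not raw:
        raise EOFError("connection closed before a response arrived")
    first = raw.decode("latin-1").rstrip("\r\n")
    if not re.fullmatch(r"[0-9]{3}", first[:3]):
        raise ValueError("malformed response: %r" % first)
    lines = [first]
    if first[3:4] == "-":
        terminator = first[:3] + " "
        while True:
            raw = stream.readline()
            if not raw:
                raise EOFError("connection closed inside a response")
            line = raw.decode("latin-1").rstrip("\r\n")
            lines.append(line)
            if line.startswith(terminator):
                break
    return int(first[:3]), lines


def classify(code):
    """Name a response by the first digit of its code."""
    names = {
        1: "mark",
        2: "acceptance",
        3: "intermediate",         # assumed label
        4: "temporary rejection",  # assumed label
        5: "permanent rejection",  # assumed label
    }
    return names.get(code // 100, "unknown")


if __name__ == "__main__":
    greeting = io.BytesIO(b"220 ftp.heaven.af.mil ready.\r\n")
    code, lines = read_response(greeting)
    print(code, classify(code))
```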

The browser then sends a USER request with an anonymous parameter, and waits for the response:

     USER anonymous
     331 Please identify yourself in a password.
If the response has a code beginning with 3, the browser sends a PASS request, and waits for a response:
     PASS simplebrowser@
     230 Thanks.
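
The login exchange can be sketched as a small function. The two callables are hypothetical stand-ins for the control connection: send writes one request line, and receive returns the numeric code of the next response. The text does not say what the browser does if the final response is a rejection, so this sketch only reports whether the last response indicated acceptance.

```python
def login(send, receive, username="anonymous", password="simplebrowser@"):
    """Send USER, then PASS only if the USER response code begins with 3.

    send and receive are assumed helpers, not part of any library.
    Returns True if the last response code begins with 2.
    """
    send("USER " + username)
    code = receive()
    if code // 100 == 3:
        send("PASS " + password)
        code = receive()
    return code // 100 == 2


if __name__ == "__main__":
    # Replay the exchange shown above against scripted response codes.
    sent = []
    scripted = [331, 230]
    print(login(sent.append, lambda: scripted.pop(0)), sent)
```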
The browser then sends a TYPE request with an I parameter, and waits for a response:
     TYPE I
     200 Okay.
The browser then sends a PASV request, and waits for a response:
     PASV
     227 =10,1,2,3,10,5
If the server does not accept the PASV request, the browser quits.

The browser then makes a separate TCP connection to 10.1.2.3, at a TCP port number determined by the PASV response. If this connection fails, the browser quits.
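
The port number is carried in the last two numbers of the PASV response: the port is the fifth number times 256 plus the sixth. For the response above that is 10 * 256 + 5 = 2565. Here is a minimal sketch of the parsing; the name parse_pasv is hypothetical, and the approach of searching for the first run of six comma-separated numbers after the code is one tolerant way to do it, not the only one.

```python
import re


def parse_pasv(response):
    """Return (host, port) from a 227 response line.

    Looks for six comma-separated numbers h1,h2,h3,h4,p1,p2 anywhere
    after the three-digit code; the port is p1 * 256 + p2.
    """
    if not response.startswith("227"):
        raise ValueError("not a PASV acceptance: %r" % response)
    match = re.search(r"(\d+),(\d+),(\d+),(\d+),(\d+),(\d+)", response[3:])
    if match is None:
        raise ValueError("no address in PASV response: %r" % response)
    numbers = [int(group) for group in match.groups()]
    if any(number > 255 for number in numbers):
        raise ValueError("number out of range in %r" % response)
    host = ".".join(str(number) for number in numbers[:4])
    port = numbers[4] * 256 + numbers[5]
    return host, port


if __name__ == "__main__":
    print(parse_pasv("227 =10,1,2,3,10,5"))
```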

The browser then sends a RETR request giving the encoded pathname of the desired file, and waits for a response. There are two possibilities at this point.

First possibility. The response to RETR is a mark:

     RETR /pub/report
     150 I see that file.
In this case, the file is a binary file, and the server starts sending the file through the separate TCP connection. The browser reads bytes of data from the separate TCP connection until that connection is closed. The browser then waits for another response to the RETR request:
     226 File transferred successfully.
If the response indicates acceptance, the FTP server has sent the complete file through the separate TCP connection. Otherwise the file has been truncated. Either way, the browser quits.
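
Reading until the connection is closed can be sketched with the standard socket module. The names read_until_closed and transfer_complete are hypothetical. The sketch collects the whole file in memory, which is an assumption made for brevity; a real browser would more likely stream the bytes to their destination.

```python
import socket


def read_until_closed(data_sock, chunk_size=65536):
    """Read bytes from the data connection until the peer closes it."""
    chunks = []
    while True:
        chunk = data_sock.recv(chunk_size)
        if not chunk:  # empty read: the connection has been closed
            break
        chunks.append(chunk)
    return b"".join(chunks)


def transfer_complete(final_code):
    """True if the final response to RETR indicates acceptance.

    Any other code means the bytes read so far may be a truncated file.
    """
    return 200 <= final_code <= 299


if __name__ == "__main__":
    # A connected socket pair stands in for the server's data connection.
    server_side, browser_side = socket.socketpair()
    server_side.sendall(b"contents of the report")
    server_side.close()
    data = read_until_closed(browser_side)
    browser_side.close()
    print(len(data), transfer_complete(226))
```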

Second possibility. The response to RETR is not a mark:

     RETR /pub/report
     550 Sorry, that isn't a data file.
In this case, the browser sends a CWD request, and waits for a response:
     CWD /pub/report
     250 "/pub/report"
If the server does not accept the CWD request, the browser quits.

Next the browser sends a LIST request, and waits for a response:

     LIST
     150 I see that directory.
If the response is not a mark, the browser quits. Otherwise the file is a directory, and the server starts sending the directory through the separate TCP connection. The browser reads bytes of data from the separate TCP connection until that connection is closed. The browser then waits for another response to the LIST request:
     226 File transferred successfully.
If the response indicates acceptance, the FTP server has sent the complete directory listing through the separate TCP connection. Otherwise the listing has been truncated. Either way, the browser quits.
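
The two possibilities can be put together as one decision procedure. This is a sketch only. The session object is hypothetical: it is assumed to offer send(line) to write a request, response() to return the numeric code of the next response, and read_data() to read the separate TCP connection until it is closed. Login, TYPE, and PASV are assumed to have happened already.

```python
def retrieve(session, pathname):
    """Try RETR; if the response is not a mark, fall back to CWD and LIST.

    Returns (kind, data, complete), where kind is "file" or "directory"
    and complete is True if the final response indicated acceptance.
    Returns None if the browser would quit before any data arrives.
    """
    session.send("RETR " + pathname)
    if session.response() // 100 == 1:
        kind = "file"  # first possibility: a mark
    else:
        # Second possibility: treat the pathname as a directory.
        session.send("CWD " + pathname)
        if session.response() // 100 != 2:
            return None
        session.send("LIST")
        if session.response() // 100 != 1:
            return None
        kind = "directory"
    data = session.read_data()
    complete = session.response() // 100 == 2
    return kind, data, complete
```

A caller that gets complete set to False should treat data as truncated.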

Options in FTP URLs

FTP URLs can be more complicated than the example above:
     ftp://joe@ftp.heaven.af.mil:50021/report;type=d
This URL has several pieces:

Some browsers accept pathnames of the form /./path to mean that /path should be appended to the result of PWD to form the pathname of the desired file.
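
As an illustration, the longer URL above can be split with generic URL syntax, and the /./ convention applied to a pathname. Both function names are hypothetical. The sketch only separates the ;type= suffix from the pathname and does not interpret its value. The PWD result is assumed to be already extracted from the server's response as a plain string.

```python
from urllib.parse import urlsplit


def split_ftp_url_with_options(url):
    """Split an ftp:// URL into username, host, port, pathname, and type."""
    parts = urlsplit(url)
    # urlsplit leaves ";type=..." attached to the path, so cut it off here.
    pathname, _, type_code = parts.path.partition(";type=")
    return {
        "username": parts.username or "anonymous",
        "host": parts.hostname,
        "port": parts.port or 21,
        "pathname": pathname,
        "type": type_code or None,
    }


def apply_dot_prefix(pathname, pwd_result):
    """For /./path, append /path to the PWD result; else leave it alone."""
    if pathname.startswith("/./"):
        return pwd_result + pathname[2:]
    return pathname


if __name__ == "__main__":
    print(split_ftp_url_with_options(
        "ftp://joe@ftp.heaven.af.mil:50021/report;type=d"))
    print(apply_dot_prefix("/./path", "/home/joe"))
```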