Get File Length over HTTP before you download it

This is one of those forum questions that you “think” you know the answer to, and then you’re proven wrong.
A user wanted to download a file from a remote site, but did not want to proceed with the download if the file was larger than 10MB. Makes sense, right?
I said there was no way to do this without downloading the file. I was wrong. Here’s how he solved his own problem:
using System;
using System.IO;
using System.Net;

class Program
{
    static void Main(string[] args)
    {
        string completeUrl =
            "http://www.eggheadcafe.com/FileUpload/1145921998_ObjectDumper.zip";
        using (WebClient client = new WebClient())
        using (Stream s = client.OpenRead(completeUrl))
        {
            // ResponseHeaders is populated once OpenRead returns, before the body is read.
            Console.WriteLine(client.ResponseHeaders["Content-Length"]);
        }
        Console.ReadLine();
    }
}

The above correctly reports the file size of 85,827 bytes without ever downloading the file!

Somebody had a problem. Instead of giving up (or worse, taking my so-called “expert advice”), he thought “outside the box” and found a solution. I call that outstanding!

NOTE: At least one commenter pointed out that a HEAD request is the most efficient approach. That's true, but in my experience not every site handles HEAD requests correctly, so you may need to make a second request (a GET) if the HEAD request fails.
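
For what it's worth, here is a rough sketch of that two-request idea: try HEAD first, and fall back to a GET that is aborted as soon as the headers arrive. The class name RemoteFileSize, its method names and the MaxBytes constant are my own placeholders, not anything from the forum thread. It assumes the server reports a Content-Length at all (a chunked response will not), and it sticks to the older System.Net classes that match the WebClient code above; newer .NET versions flag WebRequest.Create as obsolete and steer you toward HttpClient.

```csharp
using System;
using System.Net;

public static class RemoteFileSize
{
    // Hypothetical limit based on the forum question: 10 MB.
    public const long MaxBytes = 10L * 1024 * 1024;

    // Sends one request with the given HTTP method and returns the
    // Content-Length the server reports, or -1 if it did not send one.
    public static long GetContentLength(string url, string method)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = method;
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        {
            long length = response.ContentLength;
            // Drop the connection instead of reading the body.
            // This only matters for GET; a HEAD response has no body.
            request.Abort();
            return length;
        }
    }

    // Tries HEAD first; if the server rejects it or reports no length,
    // falls back to a GET that is aborted after the headers are read.
    public static long GetRemoteLength(string url)
    {
        try
        {
            long length = GetContentLength(url, "HEAD");
            if (length >= 0)
            {
                return length;
            }
        }
        catch (WebException)
        {
            // Some servers refuse or mishandle HEAD; fall through to GET.
        }
        return GetContentLength(url, "GET");
    }

    // True only when the length is known and does not exceed the limit.
    public static bool IsWithinLimit(long contentLength, long maxBytes)
    {
        return contentLength >= 0 && contentLength <= maxBytes;
    }
}
```

With something like this, the check before the real download would be RemoteFileSize.IsWithinLimit(RemoteFileSize.GetRemoteLength(completeUrl), RemoteFileSize.MaxBytes). An unknown length (-1) is treated as "do not download", which seemed like the safer default to me.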

Comments

  1. That's a neat trick!

  2. Alternatively, you can make a HEAD request to that web resource, see:

    http://forrst.com/posts/HEAD_requst_to_get_ContentLength-yVh

  3. Anonymous, 8:37 PM

    Peter, unfortunately what you (and the forum poster) are stating is only partially true. Although the entire file is not downloaded, part of the file is downloaded.

    As soon as the server responds to your GET request, it sends the HTTP headers along with the beginning of the file (whatever can fit in the first few packets). Once .NET processes the first few packets of data, the code above forcibly terminates the connection with the server.

    However, if you look at the Wireshark capture below, you can see that some of the zip file was sent across the wire. The commenter who posted about using HEAD is describing the correct mechanism for doing what the poster asked.

    Hope that helps,
    John

    GET /FileUpload/1145921998_ObjectDumper.zip HTTP/1.1
    Host: www.eggheadcafe.com
    Connection: Keep-Alive

    HTTP/1.1 200 OK
    Content-Type: application/x-zip-compressed
    Last-Modified: Sat, 28 Aug 2010 20:00:56 GMT
    Accept-Ranges: bytes
    ETag: "ad35ebaeb46cb1:0"
    Server: Microsoft-IIS/7.0
    X-Powered-By: ASP.NET
    Date: Mon, 30 Aug 2010 01:23:29 GMT
    Content-Length: 85827

    PK...........=,.O.....p.......ObjectDumper.sln...n.@.......^..%.....v.m..0.....j.A.....>Y.}..BwS...*..3.g......O]...Tdb...<+....E....I.s13(OX..".FR..L%m..P.....h.l/./l.....C........v...". .

    (I truncated the Wireshark capture here)

  4. Does this mean loading the entire file into memory just to get the size? That looks inefficient to me.

  5. @Zubair.NET!
    As Huseyin and John both pointed out, making a HEAD request is the best solution.

