evan_tech

In strace'ing some Python code that fetches a web page:
7139  read(5, "H", 1)                   = 1
7139  read(5, "T", 1)                   = 1
7139  read(5, "T", 1)                   = 1
7139  read(5, "P", 1)                   = 1
7139  read(5, "/", 1)                   = 1
7139  read(5, "1", 1)                   = 1
7139  read(5, ".", 1)                   = 1
7139  read(5, "1", 1)                   = 1
7139  read(5, " ", 1)                   = 1
7139  read(5, "3", 1)                   = 1
7139  read(5, "0", 1)                   = 1
7139  read(5, "2", 1)                   = 1
7139  read(5, " ", 1)                   = 1
7139  read(5, "F", 1)                   = 1
7139  read(5, "o", 1)                   = 1
7139  read(5, "u", 1)                   = 1
One syscall per byte.
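
To make the pattern concrete, here's a minimal sketch of the two approaches. This is not the actual HTTP library code; `CountingSocket`, `read_line_unbuffered`, `BufferedLineReader`, and `demo` are all names made up for illustration, and the socket pair just stands in for a real server connection. Each `recv()` on a real socket corresponds to one syscall, so counting the calls shows the difference.

```python
import socket


class CountingSocket:
    """Hypothetical wrapper: counts recv() calls made on a real socket."""

    def __init__(self, sock):
        self.sock = sock
        self.calls = 0

    def recv(self, n):
        self.calls += 1
        return self.sock.recv(n)


def read_line_unbuffered(sock):
    """Read up to and including b'\\n', asking for one byte per recv()."""
    chunks = []
    while True:
        byte = sock.recv(1)
        if not byte:          # peer closed the connection
            break
        chunks.append(byte)
        if byte == b"\n":
            break
    return b"".join(chunks)


class BufferedLineReader:
    """Hypothetical reader: recv() in big chunks, split lines in userspace."""

    def __init__(self, sock, bufsize=8192):
        self.sock = sock
        self.bufsize = bufsize
        self.buf = b""

    def read_line(self):
        while b"\n" not in self.buf:
            data = self.sock.recv(self.bufsize)
            if not data:      # peer closed; return whatever is left
                line, self.buf = self.buf, b""
                return line
            self.buf += data
        line, _, self.buf = self.buf.partition(b"\n")
        return line + b"\n"


def demo():
    """Send one status line over a socket pair and count recv() calls."""
    status = b"HTTP/1.1 302 Found\r\n"
    results = {}
    for name in ("unbuffered", "buffered"):
        writer, reader = socket.socketpair()
        writer.sendall(status)
        writer.close()
        counted = CountingSocket(reader)
        if name == "unbuffered":
            line = read_line_unbuffered(counted)
        else:
            line = BufferedLineReader(counted).read_line()
        reader.close()
        results[name] = (line, counted.calls)
    return results


if __name__ == "__main__":
    for name, (line, calls) in demo().items():
        print(name, calls, line)
```

The unbuffered version makes one call per byte of the status line, while the buffered one picks up the whole line in a single call and would serve later lines from its buffer without touching the socket again.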

My Python memcached bindings did the same thing for memcached responses, but nobody's complained yet. I figured if nobody had complained about the behavior of the HTTP library yet, nobody would notice that either...

Perhaps it isn't really a problem? I don't know what a syscall actually costs, but it seems like the overhead shouldn't be insignificant, especially since it's multiplied by the number of bytes in the response. Then again, maybe it is insignificant next to the overhead of these interpreted languages. Googling implies that Linux syscalls are fast (at least compared to FreeBSD's), but I don't know how fast.
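
One rough way to put a number on it is to time the same amount of data read at different chunk sizes. The sketch below is only an assumption about how one might measure this: `time_reads` is a made-up helper, it reads from `/dev/zero` (so it needs a Unix-like system), and the elapsed time includes Python's own per-call overhead, not just the kernel's. That mix is arguably the comparison that matters here anyway.

```python
import os
import time


def time_reads(total_bytes, chunk_size, path="/dev/zero"):
    """Read total_bytes from path in chunk_size pieces.

    Returns (number of os.read() calls, elapsed seconds). Each os.read()
    is one read(2) syscall. The path default is an assumption: /dev/zero
    never runs dry and involves no disk or network.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        calls = 0
        remaining = total_bytes
        start = time.perf_counter()
        while remaining > 0:
            data = os.read(fd, min(chunk_size, remaining))
            if not data:      # unexpected EOF
                break
            remaining -= len(data)
            calls += 1
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return calls, elapsed


if __name__ == "__main__":
    total = 64 * 1024
    for chunk in (1, 1024, 8192):
        calls, elapsed = time_reads(total, chunk)
        print(f"chunk={chunk:5d}  calls={calls:6d}  {elapsed * 1e3:.2f} ms")
```

Dividing the elapsed time of the 1-byte run by its call count gives an approximate per-call cost on a given machine, interpreter overhead included.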

What's more likely is that people haven't written performance-intensive apps with it yet.


PS: You can compare against Perl with something simple like
strace perl -MLWP::Simple -e "get 'http://www.google.com'"
and see that it uses
read(3, "HTTP/1.0 200 OK\r\nCache-Control: "..., 8192) = 2436
i.e., it reads the entire page in one syscall.
And Ruby's
strace ruby -rnet/http -e 'Net::HTTP.new("www.google.com").get("/")'
at least reads 1 KB at a time.