httpretrieve.repy is slow #140

choksi81 · 2014-06-02T15:33:48Z

httpretrieve.repy reads HTTP header bytes from the socket one at a time. This causes so much overhead that the CPU restriction quickly limits performance. It is easy to notice with XML-RPC: on our Android phone with a 50% CPU restriction, most XML-RPC calls take around one second, and this is communication over localhost.

I tried to implement a slightly smarter algorithm for reading data from the socket. Since we are really waiting for the four-byte "end of header" (EOH) sequence (\r\n\r\n), we can read data in chunks of four. If the current chunk ends in \r or \n, it could be (part of) the EOH. I then read as many bytes as I need to complete the EOH sequence; if it doesn't show up this time, I go back to reading four-byte blocks.

Consequence: About half of my XML-RPC calls now finish in 150 ms; the rest take about 600 ms (see the attached empirical CDFs). Here's the modified code; please let me know what you think:

### httpretrieve.repy, line 166, in function httpretrieve_open

    # Receive the header lines from the web server (a series of CRLF-terminated
    # lines, terminated by an empty line, or by the server closing the
    # connection).
    headersstr = ""

    while not headersstr.endswith("\r\n\r\n"):
      try:
        if headersstr.endswith("\r"):
          # Possibly inside the end-of-header sequence; fetch the byte after \r.
          headersstr += sockobj.recv(1)
        elif headersstr.endswith("\r\n"):
          # One CRLF seen; the next two bytes decide whether the header ends.
          headersstr += sockobj.recv(2)
        else:
          # Nothing pending, read a chunk of up to four bytes. This branch also
          # covers a bare \n (not preceded by \r), which would otherwise leave
          # the loop spinning without ever calling recv().
          headersstr += sockobj.recv(4)

      except Exception, e:
        if str(e) == "Socket closed":
          break
        else:
          raise

    httpheaderlist = headersstr.split("\r\n")
    # Ignore (a) trailing blank line(s) (for example, the response header-
    # terminating blank line).
    while len(httpheaderlist) > 0 and httpheaderlist[-1] == "":
      httpheaderlist.pop()
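
As a rough check of the idea outside Repy, here is a minimal sketch in plain Python. It is not part of httpretrieve.repy: `FakeSocket`, `read_headers_bytewise`, `read_headers_chunked`, and the sample response are all hypothetical names and data made up for illustration, and the byte-by-byte reader is only my assumption of what the old loop amounts to. The sketch counts `recv()` calls for both strategies and checks that the chunked reader stops exactly at the end of the header without consuming any of the body.

```python
class FakeSocket(object):
  """Hypothetical stand-in for a Repy socket object: serves a fixed string
  and counts how often recv() is called."""

  def __init__(self, data):
    self.data = data
    self.pos = 0      # index of the next unread character
    self.calls = 0    # number of recv() calls made so far

  def recv(self, maxbytes):
    self.calls += 1
    if self.pos >= len(self.data):
      # Mimic the exception text the issue's code checks for.
      raise Exception("Socket closed")
    chunk = self.data[self.pos:self.pos + maxbytes]
    self.pos += len(chunk)
    return chunk


def read_headers_bytewise(sockobj):
  """Assumed baseline: read one byte per recv() until \\r\\n\\r\\n shows up."""
  headersstr = ""
  while not headersstr.endswith("\r\n\r\n"):
    try:
      headersstr += sockobj.recv(1)
    except Exception as e:
      if str(e) == "Socket closed":
        break
      raise
  return headersstr


def read_headers_chunked(sockobj):
  """Read four bytes at a time, switching to shorter reads whenever the data
  so far could be the beginning of the end-of-header sequence."""
  headersstr = ""
  while not headersstr.endswith("\r\n\r\n"):
    try:
      if headersstr.endswith("\r"):
        headersstr += sockobj.recv(1)
      elif headersstr.endswith("\r\n"):
        headersstr += sockobj.recv(2)
      else:
        headersstr += sockobj.recv(4)
    except Exception as e:
      if str(e) == "Socket closed":
        break
      raise
  return headersstr


# Made-up response: three header lines, the blank line, then a 5-byte body.
response = ("HTTP/1.0 200 OK\r\n"
            "Content-Type: text/xml\r\n"
            "Content-Length: 5\r\n"
            "\r\n"
            "hello")

for reader in (read_headers_bytewise, read_headers_chunked):
  sock = FakeSocket(response)
  headers = reader(sock)
  unread = response[sock.pos:]
  print("%s: %d recv() calls, unread body: %r" % (reader.__name__, sock.calls, unread))
```

Both readers should leave the body untouched for the next read; the difference is only in how many `recv()` calls it takes to get there. The short reads after a trailing \r or \r\n are what keep the chunked version from reading past the blank line.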


choksi81 · 2014-06-11T23:36:57Z

Attachments:
https://github.com/SeattleTestbed/attic/blob/master/TICKET_ATTACHMENTS/httpretrive_performance.pdf

choksi81 self-assigned this Jun 2, 2014
