*******
Caching
*******

lazr.restfulclient automatically caches the responses to its requests
in a temporary directory.

    >>> import httplib2
    >>> httplib2.debuglevel = 1
    >>> from lazr.restfulclient.tests.example import CookbookWebServiceClient
    >>> service_with_cache = CookbookWebServiceClient()
    send: 'GET /1.0/ ...
    reply: ...200...
    ...
    header: Content-Type: application/vnd.sun.wadl+xml
    ...
    send: 'GET /1.0/ ...
    reply: ...200...
    ...
    header: Content-Type: application/json
    ...
    >>> print service_with_cache.recipes[4].instructions
    send: 'GET /1.0/recipes/4 ...
    reply: ...200...
    ...
    Preheat oven to...

The second and subsequent times you request some object, it's likely
that lazr.restfulclient will make a conditional HTTP GET request instead of
a normal request. The HTTP response code will be 304 instead of 200,
and lazr.restfulclient will use the cached representation of the object.

    >>> print service_with_cache.recipes[4].instructions
    send: 'GET /1.0/recipes/4 ...
    reply: ...304...
    ...
    Preheat oven to...
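
The 200-then-304 exchange can be mimicked without a network. What follows
is a minimal sketch, not the code lazr.restfulclient or httplib2 actually
runs: ``fake_server`` and ``cached_get`` are hypothetical names, the
validator is assumed to be an MD5-based ETag, and a plain dict stands in
for the cache directory.

```python
import hashlib

# Hypothetical resource store standing in for the example web service.
RESOURCES = {"/1.0/recipes/4": "Preheat oven to..."}


def fake_server(path, if_none_match=None):
    """Return (status, etag, body), honoring an If-None-Match validator."""
    body = RESOURCES[path]
    etag = hashlib.md5(body.encode("utf-8")).hexdigest()
    if if_none_match == etag:
        # The client's copy is still current, so no body is sent.
        return 304, etag, None
    return 200, etag, body


def cached_get(path, cache):
    """Fetch path, revalidating any cached copy with a conditional GET."""
    cached = cache.get(path)  # either (etag, body) or None
    validator = cached[0] if cached else None
    status, etag, body = fake_server(path, if_none_match=validator)
    if status == 304:
        body = cached[1]  # reuse the cached representation
    else:
        cache[path] = (etag, body)
    return status, body


cache = {}
first = cached_get("/1.0/recipes/4", cache)   # unconditional request
second = cached_get("/1.0/recipes/4", cache)  # conditional request
```

The second call returns the same body as the first, but the simulated
server never sends it again; only the status and validator cross the
"wire".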

This is true even if you initially got the object as part of a
collection.

    >>> recipes = service_with_cache.recipes[:10]
    send: ...
    reply: ...200...
    >>> first_recipe = recipes[0]
    >>> first_recipe.lp_refresh()
    send: ...
    reply: ...304...

Note that if you get an object as part of a collection and then get it
some other way, a conditional GET request will *not* be made. This is
a shortcoming of the library.

    >>> service_with_cache.recipes[first_recipe.id]
    send: ...
    reply: ...200...

The default lazr.restfulclient cache directory is a temporary directory
that's deleted when the Python process ends. (If the process is
killed, the directory will stick around in /tmp.) It's much more
efficient to keep a cache directory across multiple uses of
lazr.restfulclient.

You can provide a cache directory name as an argument when creating a
Service object. This directory will fill up with cached HTTP
responses, and since it's a directory you control, it will persist
across lazr.restfulclient sessions.

    >>> import tempfile
    >>> tempdir = tempfile.mkdtemp()
    >>> first_service = CookbookWebServiceClient(cache=tempdir)
    send: 'GET /1.0/ ...
    reply: ...200...
    ...
    send: 'GET /1.0/ ...
    reply: ...200...
    ...
    >>> print first_service.recipes[4].instructions
    send: 'GET /1.0/recipes/4 ...
    reply: ...200...
    ...
    Preheat oven to...

This will save you a *lot* of time in subsequent sessions, because
you'll be able to use cached versions of the initial (very expensive)
documents. A new client will not re-request the service root at all.

    >>> second_service = CookbookWebServiceClient(cache=unicode(tempdir))

You'll also be able to make conditional requests for many resources
and avoid transferring their full representations.

    >>> print second_service.recipes[4].instructions
    send: 'GET /1.0/recipes/4 ...
    reply: ...304...
    ...
    Preheat oven to...

Of course, if you ever need to clear the cache directory, you'll have
to do it yourself.

Cleanup.

    >>> import shutil
    >>> shutil.rmtree(tempdir)
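
If you keep a long-lived cache directory, you may prefer to empty it
rather than delete it. The helper below is a hypothetical sketch
(``clear_cache`` is not part of lazr.restfulclient) that uses only the
standard library and is demonstrated on a throwaway directory with
made-up file names. It assumes no client is using the directory while
it runs.

```python
import os
import shutil
import tempfile


def clear_cache(cache_dir):
    """Delete everything inside cache_dir but keep the directory itself."""
    removed = 0
    for name in os.listdir(cache_dir):
        path = os.path.join(cache_dir, name)
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path)
        else:
            os.remove(path)
        removed += 1
    return removed


# Demonstrate on a temporary directory holding two fake cache files.
demo_dir = tempfile.mkdtemp()
for name in ("root,application,json", "root,vnd.sun.wadl+xml"):
    with open(os.path.join(demo_dir, name), "w") as handle:
        handle.write("cached response")

removed_count = clear_cache(demo_dir)
remaining = os.listdir(demo_dir)
shutil.rmtree(demo_dir)  # tidy up the demonstration directory
```

Emptying the directory only costs you the cached responses; the next
client will simply fetch the service root again.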

Cache expiration
----------------

The '1.0' version of the example web service, which we've been using up
until now, sets a long cache expiry time for the service root. That's why
we were able to create a second client that didn't request the service
root at all; it just fetched the representations from its cache.

The 'devel' version of the example web service sets a cache expiry
time of two seconds. Let's see what that looks like on the client side.

    >>> tempdir = tempfile.mkdtemp()
    >>> first_service = CookbookWebServiceClient(
    ...     cache=tempdir, version='devel')
    send: 'GET /devel/ ...
    reply: ...200...
    ...
    send: 'GET /devel/ ...
    reply: ...200...
    ...

Now let's wait for three seconds to make sure the representations become
stale.

    >>> from time import sleep
    >>> sleep(3)

When the representations are stale, a new client makes *conditional*
requests for them. If the server reports that nothing has changed (a
304 response, as here), the cached representations are considered to
have been refreshed, just as if the server had sent them again.

    >>> second_service = CookbookWebServiceClient(
    ...     cache=tempdir, version='devel')
    send: 'GET /devel/ ...
    reply: ...304...
    ...
    send: 'GET /devel/ ...
    reply: ...304...
    ...

Let's quickly create another client before the representation grows
stale again.

    >>> second_service = CookbookWebServiceClient(
    ...     cache=tempdir, version='devel')

When the representations are not stale, a new client does not make any
HTTP requests at all; it fetches the representations directly from the
cache.

Cleanup.

    >>> httplib2.debuglevel = 0
    >>> shutil.rmtree(tempdir)
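
The three behaviours seen in this section (no request, conditional
request, full request) come down to an age check against the expiry
time. The function below is a simplified sketch with a hypothetical
name, ``cache_action``; real HTTP freshness rules take more headers
into account than a single expiry time.

```python
def cache_action(stored_at, max_age, now):
    """Decide what a client does with a cached representation.

    All three arguments are in seconds and share the same clock.
    Pass stored_at=None when nothing has been cached yet.
    """
    if stored_at is None:
        return "full GET"         # nothing cached: expect a 200
    if now - stored_at < max_age:
        return "use cache"        # still fresh: no HTTP request at all
    return "conditional GET"      # stale: revalidate, expect 200 or 304


# The 'devel' service root is described as expiring after two seconds.
never_cached = cache_action(stored_at=None, max_age=2, now=0.0)
fresh = cache_action(stored_at=0.0, max_age=2, now=1.0)
stale = cache_action(stored_at=0.0, max_age=2, now=3.0)
```

With a two-second expiry, a copy stored at time 0 is used as-is at one
second and revalidated at three seconds, which matches the sequence of
clients created above.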

Cache filenames
---------------

lazr.restfulclient caches HTTP responses in individual files named
after the URL accessed. This behavior is derived from httplib2, but
lazr.restfulclient does two things differently from httplib2.

To see these two things, let's set up a client that uses a temporary
directory as its cache. The directory starts out empty.

    >>> from os import listdir
    >>> tempdir = tempfile.mkdtemp()
    >>> len(listdir(tempdir))
    0

As soon as we create a client object, though, lazr.restfulclient
fetches a JSON and a WADL representation of the service root, and
caches them individually.

    >>> service = CookbookWebServiceClient(cache=tempdir)
    >>> cache_contents = listdir(tempdir)
    >>> for file in sorted(cache_contents):
    ...     print file
    cookbooks.dev...application,json...
    cookbooks.dev...vnd.sun.wadl+xml...

This is the first difference between lazr.restfulclient's caching and
httplib2's. httplib2 would store every response for the service root
under a filename based solely on the URL. This effectively limits
httplib2 to a single representation of a given resource: the WADL
representation would be overwritten with the JSON
representation. lazr.restfulclient incorporates the media type in the
cache filename, so that WADL and JSON representations are stored
separately.
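
The effect of keying on the media type can be shown with two dicts
standing in for cache directories. This is an illustrative sketch only:
the function names, the ``http://cookbooks.dev/1.0/`` URL and the
comma-joined naming scheme are assumptions, not the exact filenames
either library produces.

```python
from urllib.parse import quote


def url_only_name(url):
    """Cache name derived from the URL alone: one slot per resource."""
    return quote(url, safe="")


def url_and_media_type_name(url, media_type):
    """Cache name that also encodes the media type: one slot per representation."""
    return quote(url, safe="") + "," + quote(media_type, safe="+")


root = "http://cookbooks.dev/1.0/"  # assumed service root URL
wadl = "application/vnd.sun.wadl+xml"
json_type = "application/json"

url_only = {}   # simulated cache keyed on the URL alone
per_type = {}   # simulated cache keyed on URL plus media type
for media_type in (wadl, json_type):
    url_only[url_only_name(root)] = media_type
    per_type[url_and_media_type_name(root, media_type)] = media_type
```

After both representations are stored, ``url_only`` holds a single
entry (the JSON one, which replaced the WADL one), while ``per_type``
holds both.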

The second difference has to do with filename length limits. httplib2
caps filenames at about 240 characters so that cache files can be
stored on filesystems with 255-character filename length limits. For
compatibility with eCryptfs filesystems, lazr.restfulclient goes
further, and caps filenames at 143 characters.

To test out the limit, let's create a cookbook with an incredibly
long name.

    >>> long_name = (
    ...     "This cookbook name is amazingly long; so long that it will "
    ...     "surely be truncated when it is incorporated into a file "
    ...     "name for the cache. The cache file will contain a cached "
    ...     "HTTP respone containing a JSON representation of of this "
    ...     "cookbook, whose name, I repeat, is very long indeed.")
    >>> len(long_name)
    281
    >>> import datetime
    >>> date = datetime.datetime(1994, 1, 1)
    >>> book = service.cookbooks.create(
    ...     name=long_name, cuisine="General", copyright_date=date,
    ...     price=10.22, last_printing=date)

lazr.restfulclient automatically fetched a JSON representation of the
new cookbook, so it's already present in the cache. Because a
cookbook's URL incorporates its name, and this cookbook's name is
incredibly long, it must have been truncated to fit on disk.

    >>> [cookbook_cache_filename] = [file for file in listdir(tempdir)
    ...                              if 'amazingly' in file]

Indeed, the filename has been truncated to fit in the rough
143-character safety limit for eCryptfs filesystems.

    >>> len(cookbook_cache_filename)
    143

Despite the truncation, some of the useful information from the
cookbook's name makes it into the filename, making it easy to find when
manually crawling through the cache directory.

    >>> print cookbook_cache_filename
    cookbooks.dev...This%20cookbook%20name%20is%20amazingly%20long...

To avoid conflicts caused by truncation, the filename always ends with
an MD5 sum derived from the untruncated URL. Let's create a second
cookbook whose name differs from the first cookbook only at the end.

    >>> longer_name = long_name + ": The Sequel"
    >>> book = service.cookbooks.create(
    ...     name=longer_name, cuisine="General", copyright_date=date,
    ...     price=10.22, last_printing=date)

This cookbook's URL is identical to the first cookbook's URL for far
longer than 143 characters. But since the truncated filename
incorporates an MD5 sum based on the full URL, the two cookbooks are
cached in separate files.

    >>> [file1, file2] = [file for file in listdir(tempdir)
    ...                   if 'amazingly' in file]

The filenames are identical up to the last 32 characters, which is
where the MD5 sum begins. But because the MD5 sums are different, they
are not completely identical.

    >>> file1[:-32] == file2[:-32]
    True
    >>> file1 == file2
    False
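
The truncate-then-hash idea can be reproduced in a few lines. This is a
sketch under stated assumptions rather than the library's own routine:
``truncated_name`` is a hypothetical helper, the comma separator and
the sample names are invented, and only the 143-character limit and the
32-character MD5 suffix come from the text above.

```python
import hashlib

MAX_LENGTH = 143  # eCryptfs-safe limit quoted in the text


def truncated_name(full_name, limit=MAX_LENGTH):
    """Shorten full_name to at most `limit` characters, ending in an MD5 digest."""
    # The digest is computed from the untruncated name: 32 hex characters.
    digest = hashlib.md5(full_name.encode("utf-8")).hexdigest()
    prefix_length = limit - len(digest) - 1  # leave room for the separator
    return full_name[:prefix_length] + "," + digest


# Two made-up names that share a prefix far longer than the limit.
base = "cookbooks.dev,1.0,cookbooks," + "x" * 300
name1 = truncated_name(base)
name2 = truncated_name(base + "%3A%20The%20Sequel")
```

Both results are exactly 143 characters long and agree up to the last
32 characters, yet they differ, because each digest is computed from
the full name rather than from the truncated prefix.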

Cleanup.

    >>> import shutil
    >>> shutil.rmtree(tempdir)