..
    This file is part of lazr.uri.

    lazr.uri is free software: you can redistribute it and/or modify it
    under the terms of the GNU Lesser General Public License as published by
    the Free Software Foundation, version 3 of the License.

    lazr.uri is distributed in the hope that it will be useful, but
    WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
    or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public
    License for more details.

    You should have received a copy of the GNU Lesser General Public License
    along with lazr.uri.  If not, see <http://www.gnu.org/licenses/>.

lazr.uri
********

The lazr.uri package includes code for parsing and dealing with URIs.

    >>> import lazr.uri
    >>> print('VERSION:', lazr.uri.__version__)
    VERSION: ...

=============
The URI class
=============

    >>> from lazr.uri import URI
    >>> uri1 = URI('http://localhost/foo/bar?123')
    >>> uri2 = URI('http://localhost/foo/bar/baz')
    >>> uri1.contains(uri2)
    True

These next two are equivalent, so the answer should be True, even though
the "outside" one has a trailing slash that the "inside" one lacks.

    >>> uri1 = URI('http://localhost/foo/bar/')
    >>> uri2 = URI('http://localhost/foo/bar')
    >>> uri1.contains(uri2)
    True

The next two are exactly the same.  A URI is considered to be inside itself.

    >>> uri1 = URI('http://localhost/foo/bar/')
    >>> uri2 = URI('http://localhost/foo/bar/')
    >>> uri1.contains(uri2)
    True

In the next case, the string of uri2 starts with the string of uri1.  But
because uri2 continues within the same path step, uri2 is not inside uri1.

    >>> uri1 = URI('http://localhost/foo/ba')
    >>> uri2 = URI('http://localhost/foo/bar')
    >>> uri1.contains(uri2)
    False

Here, uri2 is uri1 plus an extra path step, so uri2 is inside uri1.

    >>> uri1 = URI('http://localhost/foo/bar/')
    >>> uri2 = URI('http://localhost/foo/bar/baz')
    >>> uri1.contains(uri2)
    True

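The containment examples above compare whole path steps rather than raw
string prefixes.  The following is a minimal sketch of that idea using only
the standard library.  It is not lazr.uri's implementation: the names
``path_steps`` and ``contains_sketch`` are hypothetical, the query and
fragment are ignored, and it assumes that both URIs must share the same
scheme and authority.

```python
from urllib.parse import urlsplit


def path_steps(path):
    """Split a path into its non-empty steps, so a trailing slash is ignored."""
    return [step for step in path.split('/') if step]


def contains_sketch(outer, inner):
    """Return True if `inner` is at or below `outer`, comparing whole path steps."""
    a, b = urlsplit(outer), urlsplit(inner)
    # Assumption: URIs with a different scheme or authority never contain each other.
    if (a.scheme, a.netloc) != (b.scheme, b.netloc):
        return False
    outer_steps = path_steps(a.path)
    # `inner` is inside `outer` when its leading steps equal all of `outer`'s steps.
    return path_steps(b.path)[:len(outer_steps)] == outer_steps


print(contains_sketch('http://localhost/foo/ba', 'http://localhost/foo/bar'))
print(contains_sketch('http://localhost/foo/bar/', 'http://localhost/foo/bar/baz'))
```

Comparing lists of steps is what makes ``/foo/ba`` fail to contain
``/foo/bar`` while ``/foo/bar/`` and ``/foo/bar`` are treated as equivalent.
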
Once the URI is parsed, its parts are accessible.

    >>> uri = URI('https://fish.tree:8666/blee/blah')
    >>> uri.scheme
    'https'
    >>> uri.host
    'fish.tree'
    >>> uri.port
    '8666'
    >>> uri.authority
    'fish.tree:8666'
    >>> uri.path
    '/blee/blah'

    >>> uri = URI('https://localhost/blee/blah')
    >>> uri.scheme
    'https'
    >>> uri.host
    'localhost'
    >>> uri.port is None
    True
    >>> uri.authority
    'localhost'
    >>> uri.path
    '/blee/blah'

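For comparison only, the standard library's ``urllib.parse.urlsplit`` exposes
similar parts under different names.  The sketch below uses a hypothetical
helper, ``describe``, to collect them.  One difference worth noting: the
doctest above shows ``URI.port`` as the string ``'8666'``, whereas
``urlsplit`` reports the port as an integer.

```python
from urllib.parse import urlsplit


def describe(uri):
    """Collect the parts of `uri` as reported by urllib.parse.urlsplit."""
    parts = urlsplit(uri)
    return {
        'scheme': parts.scheme,
        'host': parts.hostname,     # urlsplit calls the host "hostname"
        'port': parts.port,         # an int, or None when no port is given
        'authority': parts.netloc,  # urlsplit calls the authority "netloc"
        'path': parts.path,
    }


print(describe('https://fish.tree:8666/blee/blah'))
print(describe('https://localhost/blee/blah'))
```
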
The grammar from RFC 3986 does not allow for square brackets in the
query component, but Section 3.4 does say how such delimiter
characters should be handled if found in the component.

    >>> uri = URI('http://www.apple.com/store?delivery=[slow]#horse+cart')
    >>> uri.scheme
    'http'
    >>> uri.host
    'www.apple.com'
    >>> uri.port is None
    True
    >>> uri.path
    '/store'
    >>> uri.query
    'delivery=[slow]'
    >>> uri.fragment
    'horse+cart'

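One way to see why such a URI can still be split into components is the
generic regular expression given in RFC 3986, Appendix B, which delimits the
query only by ``?`` and ``#``.  The sketch below applies that expression; the
helper name ``split_uri`` is hypothetical, and this is not necessarily how
lazr.uri parses URIs.

```python
import re

# Generic URI-splitting expression from RFC 3986, Appendix B.
URI_REGEX = re.compile(
    r'^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?')


def split_uri(text):
    """Split `text` into the five generic components; absent ones are None."""
    # Every group in the expression is optional, so it matches any string.
    match = URI_REGEX.match(text)
    return {
        'scheme': match.group(2),
        'authority': match.group(4),
        'path': match.group(5),
        'query': match.group(7),
        'fragment': match.group(9),
    }


parts = split_uri('http://www.apple.com/store?delivery=[slow]#horse+cart')
print(parts['query'])     # delivery=[slow]
print(parts['fragment'])  # horse+cart
```
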
====================
Finding URIs in Text
====================

lazr.uri also knows how to retrieve a list of URIs from a block of
text.  This is intended for uses such as finding bug tracker URIs.

The find_uris_in_text() function returns an iterator that yields a URI
object for each URI found in the text.  Note that the returned URIs
have been canonicalised by the URI class:

    >>> from lazr.uri import find_uris_in_text
    >>> text = '''
    ... A list of URIs:
    ...  * http://localhost/a/b
    ...  * http://launchpad.net
    ...  * MAILTO:joe@example.com
    ...  * xmpp:fred@example.org
    ...  * http://bazaar.launchpad.net/%7ename12/firefox/foo
    ...  * http://somewhere.in/time?track=[02]#wasted-years
    ... '''
    >>> for uri in find_uris_in_text(text):
    ...     print(uri)
    http://localhost/a/b
    http://launchpad.net/
    mailto:joe@example.com
    xmpp:fred@example.org
    http://bazaar.launchpad.net/~name12/firefox/foo
    http://somewhere.in/time?track=[02]#wasted-years

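The output above shows three visible effects of canonicalisation: the scheme
is lowercased, an empty path after an authority becomes ``/``, and ``%7e``
is decoded to ``~``.  The sketch below reproduces just those effects with
the standard library.  It is an illustration under stated assumptions, not
lazr.uri's algorithm: the names ``find_uris_sketch``, ``canonicalise_sketch``
and ``decode_unreserved`` are hypothetical, the candidate pattern is a crude
approximation, and trailing punctuation is not trimmed.

```python
import re
import string
from urllib.parse import urlsplit, urlunsplit

# Characters that RFC 3986 calls "unreserved"; these never need percent-encoding.
UNRESERVED = set(string.ascii_letters + string.digits + '-._~')
# Assumption: a candidate is "scheme:" followed by non-space, non-bracket text.
CANDIDATE = re.compile(r'\b[A-Za-z][A-Za-z0-9+.-]*:[^\s<>"]+')
PERCENT = re.compile(r'%([0-9A-Fa-f]{2})')


def decode_unreserved(text):
    """Decode percent-escapes of unreserved characters; uppercase the rest."""
    def replace(match):
        char = chr(int(match.group(1), 16))
        return char if char in UNRESERVED else match.group(0).upper()
    return PERCENT.sub(replace, text)


def canonicalise_sketch(raw):
    """Lowercase scheme and authority, decode the path, default the path to '/'."""
    parts = urlsplit(raw)
    path = decode_unreserved(parts.path)
    if parts.netloc and not path:
        path = '/'
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path,
                       parts.query, parts.fragment))


def find_uris_sketch(text):
    """Yield a canonicalised string for each candidate URI found in `text`."""
    for match in CANDIDATE.finditer(text):
        yield canonicalise_sketch(match.group(0))


sample = ('See http://launchpad.net and MAILTO:joe@example.com or '
          'http://bazaar.launchpad.net/%7ename12/firefox/foo')
for uri in find_uris_sketch(sample):
    print(uri)
```
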
.. pypi description ends here

.. toctree::
   :glob:

   NEWS