Decoding / Encoding images and videos
=====================================

.. currentmodule:: torchvision.io

The :mod:`torchvision.io` package provides functions for performing IO
operations. They are currently specific to reading and writing images and
videos.

Images
------

.. autosummary::
    :toctree: generated/
    :template: function.rst

    read_image
    decode_image
    encode_jpeg
    decode_jpeg
    write_jpeg
    encode_png
    decode_png
    write_png
    read_file
    write_file

.. autosummary::
    :toctree: generated/
    :template: class.rst

    ImageReadMode

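
The following is a minimal sketch of how the image functions listed above fit
together. It is an illustration rather than a reference: the synthetic tensor,
its shape, and the temporary file name ``example.png`` are assumptions made for
the example. It encodes a tensor to PNG bytes in memory, decodes it back, and
then repeats the round trip through a file on disk.

.. code:: python

    import os
    import tempfile

    import torch
    from torchvision.io import (
        ImageReadMode,
        decode_png,
        encode_png,
        read_image,
        write_png,
    )

    # Synthetic 3-channel uint8 image in (C, H, W) layout (assumed for the example).
    img = torch.randint(0, 256, (3, 32, 48), dtype=torch.uint8)

    # In-memory round trip: the encoded image is a 1-D uint8 tensor of PNG bytes.
    encoded = encode_png(img)
    decoded = decode_png(encoded)

    # On-disk round trip through a temporary file (hypothetical file name).
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.png")
        write_png(img, path)
        from_disk = read_image(path)
        # ImageReadMode controls the channel conversion applied while decoding.
        gray = read_image(path, mode=ImageReadMode.GRAY)

PNG is lossless, so both round trips should reproduce the original tensor.
The same pattern applies to ``encode_jpeg`` and ``decode_jpeg``, except that
JPEG is lossy and the decoded values will generally differ from the input.
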
Video
-----

.. autosummary::
    :toctree: generated/
    :template: function.rst

    read_video
    read_video_timestamps
    write_video

Fine-grained video API
^^^^^^^^^^^^^^^^^^^^^^

In addition to the :func:`read_video` function, we provide a high-performance
lower-level API that offers more fine-grained control.
It fully supports TorchScript.

.. betastatus:: fine-grained video API

.. autosummary::
    :toctree: generated/
    :template: class.rst

    VideoReader

Example of inspecting a video:

.. code:: python

    import torchvision

    video_path = "path to a test video"

    # The constructor allocates memory and a threaded decoder
    # instance per video. At the moment it takes two arguments:
    # the path to the video file and the desired stream.
    reader = torchvision.io.VideoReader(video_path, "video")

    # Information about the video can be retrieved using the
    # `get_metadata()` method. It returns a dictionary for every stream, with
    # duration and other relevant metadata (often frame rate).
    reader_md = reader.get_metadata()

    # The metadata is structured as a dict of dicts with the following structure:
    # {"stream_type": {"attribute": [attribute per stream]}}
    #
    # The following prints the list of frame rates for every video stream present.
    print(reader_md["video"]["fps"])

    # We explicitly select the stream we would like to operate on. The
    # constructor selects a default video stream, but in practice
    # we can set whichever stream we would like.
    reader.set_current_stream("video:0")