Video MViT
==========

.. currentmodule:: torchvision.models.video

The MViT model is based on the
`MViTv2: Improved Multiscale Vision Transformers for Classification and Detection
<https://arxiv.org/abs/2112.01526>`__ and `Multiscale Vision Transformers
<https://arxiv.org/abs/2104.11227>`__ papers.


Model builders
--------------

The following model builders can be used to instantiate an MViT v1 or v2 model, with or
without pre-trained weights. All the model builders internally rely on the
``torchvision.models.video.MViT`` base class. Please refer to the `source code
<https://github.com/pytorch/vision/blob/main/torchvision/models/video/mvit.py>`_ for
more details about this class.

.. autosummary::
    :toctree: generated/
    :template: function.rst

    mvit_v1_b
    mvit_v2_s
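
As a minimal sketch of how one of the builders listed above might be used, the
snippet below instantiates ``mvit_v2_s`` without pre-trained weights and runs a
forward pass on a random clip. The input layout ``(batch, channels, frames,
height, width)`` and the clip size of 16 frames at 224x224 are assumptions based
on the builder's default configuration, not values stated on this page.

.. code-block:: python

    import torch
    from torchvision.models.video import mvit_v2_s

    # weights=None builds the architecture with random initialization,
    # so nothing is downloaded.
    model = mvit_v2_s(weights=None)
    model.eval()

    # Assumed input layout: (batch, channels, frames, height, width).
    # 16 frames at 224x224 is assumed to match the builder's defaults.
    clip = torch.randn(1, 3, 16, 224, 224)

    # Disable gradient tracking for inference.
    with torch.no_grad():
        logits = model(clip)

    # One row of class scores per clip in the batch.
    print(logits.shape)

To start from pre-trained weights instead, a member of the corresponding weights
enum (for example ``MViT_V2_S_Weights.KINETICS400_V1``) can be passed as
``weights``. In that case the clip should be preprocessed with the transforms
returned by that member's ``transforms()`` method rather than filled with random
values.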