VisionTransformer
=================

.. currentmodule:: torchvision.models

The VisionTransformer model is based on the `An Image is Worth 16x16 Words:
Transformers for Image Recognition at Scale <https://arxiv.org/abs/2010.11929>`_ paper.

Model builders
--------------

The following model builders can be used to instantiate a VisionTransformer model, with or
without pre-trained weights. All the model builders internally rely on the
``torchvision.models.vision_transformer.VisionTransformer`` base class.
Please refer to the `source code
<https://github.com/pytorch/vision/blob/main/torchvision/models/vision_transformer.py>`_ for
more details about this class.

.. autosummary::
    :toctree: generated/
    :template: function.rst

    vit_b_16
    vit_b_32
    vit_l_16
    vit_l_32
    vit_h_14
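
As a minimal sketch of how one of these builders might be used, the snippet below
instantiates ``vit_b_16`` without pre-trained weights and runs a single forward pass.
It assumes torchvision 0.13 or later (where the builders accept a ``weights`` argument);
the random input tensor and the variable names are illustrative only.

.. code-block:: python

    import torch
    from torchvision.models import vit_b_16
    from torchvision.models.vision_transformer import VisionTransformer

    # weights=None builds the architecture with randomly initialized
    # parameters, so nothing is downloaded.
    model = vit_b_16(weights=None)
    model.eval()

    # The builder returns an instance of the shared base class.
    assert isinstance(model, VisionTransformer)

    # Dummy batch (illustrative): one 3-channel 224x224 image, which matches
    # the default input size of vit_b_16.
    batch = torch.rand(1, 3, 224, 224)
    with torch.no_grad():
        logits = model(batch)

    print(logits.shape)  # torch.Size([1, 1000])

Passing ``weights=ViT_B_16_Weights.DEFAULT`` (importable from ``torchvision.models``)
instead of ``None`` should load pre-trained weights, which are downloaded on first use.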