"""
================================
Datasets (:mod:`scipy.datasets`)
================================

.. currentmodule:: scipy.datasets

Dataset Methods
===============

.. autosummary::
   :toctree: generated/

   ascent
   face
   electrocardiogram

Utility Methods
===============

.. autosummary::
   :toctree: generated/

   download_all    -- Download all the dataset files to the specified path.
   clear_cache     -- Clear the cached dataset directory.

Usage of Datasets
=================

SciPy dataset methods can be called simply as ``'<dataset-name>()'``.
The first call downloads the dataset files over the network and caches them;
the method then returns a `numpy.ndarray` object representing the dataset.

Note that the return data structure and data type might be different for
different dataset methods. For a more detailed usage example, see the
documentation of the particular dataset method listed above.
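
Because the return type can differ between dataset methods, it can help to
inspect what a call returns before using it. The following is only a minimal
sketch: ``summarize`` is a hypothetical helper, not part of `scipy.datasets`,
and it accepts any zero-argument callable.

```python
def summarize(loader):
    # Call the loader once and report the type, shape and dtype of the result.
    # Attributes that the returned object lacks are reported as None.
    data = loader()
    return {
        "type": type(data).__name__,
        "shape": getattr(data, "shape", None),
        "dtype": getattr(data, "dtype", None),
    }

# Typical use (needs Pooch and, on the first call, network access):
#     from scipy import datasets
#     summarize(datasets.ascent)
#     summarize(datasets.electrocardiogram)
```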

How dataset retrieval and storage works
=======================================

SciPy dataset files are stored within individual GitHub repositories under the
SciPy GitHub organization, following the naming convention
``'dataset-<name>'``. For example, the `scipy.datasets.face` files live at
https://github.com/scipy/dataset-face. The `scipy.datasets` submodule depends
on `Pooch <https://www.fatiando.org/pooch/latest/>`_, a Python package built
to simplify fetching data files. Pooch uses these repos to retrieve the
respective dataset files when a dataset function is called.

A registry of all the datasets, essentially a mapping of filenames to their
SHA256 hashes and repo URLs, is maintained. Pooch uses it to handle and verify
the downloads when a dataset function is called. After a dataset has been
downloaded once, its files are saved in the system cache directory under
``'scipy-data'``.
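
The kind of check that a hash registry enables can be sketched with the
standard library alone. This is an illustration under stated assumptions, not
the code Pooch runs: ``sha256_of``, ``is_valid`` and the ``demo_registry``
mapping are hypothetical names, and a throwaway file stands in for a real
dataset.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=65536):
    # Hash the file in chunks so large files need not fit in memory.
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def is_valid(path, registry):
    # `registry` is a hypothetical mapping of filename to the expected
    # SHA256 hex digest. Unknown filenames are treated as invalid.
    expected = registry.get(os.path.basename(path))
    return expected is not None and sha256_of(path) == expected


# Demonstration with a temporary file instead of a real dataset file.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.dat")
    with open(demo_path, "wb") as fh:
        fh.write(b"example bytes")
    demo_registry = {"demo.dat": hashlib.sha256(b"example bytes").hexdigest()}
    ok = is_valid(demo_path, demo_registry)
    tampered = is_valid(demo_path, {"demo.dat": "0" * 64})
```

Here ``ok`` is true because the digest matches the registry entry, while
``tampered`` is false because the expected digest differs.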

Dataset cache locations may vary on different platforms.

For macOS::

    '~/Library/Caches/scipy-data'

For Linux and other Unix-like platforms::

    '~/.cache/scipy-data'  # or the value of the XDG_CACHE_HOME env var, if defined

For Windows::

    'C:\\Users\\<user>\\AppData\\Local\\<AppAuthor>\\scipy-data\\Cache'
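
The platform defaults listed above can be summarized in a small sketch. It
only mirrors the documented paths; it is not how the cache directory is
actually resolved. ``guess_cache_dir`` and its parameters are hypothetical,
the use of the ``LOCALAPPDATA`` variable on Windows is an assumption, and the
``<AppAuthor>`` placeholder is left unfilled on purpose.

```python
import ntpath
import posixpath


def guess_cache_dir(platform, home, env=None, app_author="<AppAuthor>"):
    # `platform` is a string in the style of sys.platform, `home` is the
    # user's home directory and `env` is a mapping of environment variables.
    env = env or {}
    if platform == "darwin":
        return posixpath.join(home, "Library", "Caches", "scipy-data")
    if platform.startswith("win"):
        # Assumption: LOCALAPPDATA points at the per-user AppData/Local folder.
        local = env.get("LOCALAPPDATA") or ntpath.join(home, "AppData", "Local")
        return ntpath.join(local, app_author, "scipy-data", "Cache")
    # Linux and other Unix-like platforms honour XDG_CACHE_HOME if defined.
    base = env.get("XDG_CACHE_HOME") or posixpath.join(home, ".cache")
    return posixpath.join(base, "scipy-data")
```

For a real run, ``sys.platform``, ``os.path.expanduser("~")`` and
``os.environ`` would be natural arguments.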


In environments where network connectivity is constrained for security
reasons, or on systems without a continuous internet connection, one may
populate the cache manually by placing the contents of the dataset repo in the
cache directory mentioned above. This avoids dataset fetch errors when no
internet connection is available.
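
Populating the cache by hand amounts to a file copy. The sketch below assumes
a local copy of a dataset repo is already available; ``populate_cache`` is a
hypothetical helper, and the directory and file names in the demonstration are
made up for illustration.

```python
import os
import shutil
import tempfile


def populate_cache(repo_dir, cache_dir):
    # Copy every regular top-level file of `repo_dir` into `cache_dir`.
    # Subdirectories such as `.git` are skipped.
    os.makedirs(cache_dir, exist_ok=True)
    copied = []
    for name in sorted(os.listdir(repo_dir)):
        src = os.path.join(repo_dir, name)
        if os.path.isfile(src):
            shutil.copy2(src, os.path.join(cache_dir, name))
            copied.append(name)
    return copied


# Demonstration with temporary directories instead of a real dataset repo.
with tempfile.TemporaryDirectory() as tmp:
    repo = os.path.join(tmp, "dataset-demo")
    os.makedirs(os.path.join(repo, ".git"))
    with open(os.path.join(repo, "demo.dat"), "wb") as fh:
        fh.write(b"example bytes")
    cache = os.path.join(tmp, "scipy-data")
    copied = populate_cache(repo, cache)
    cached_files = sorted(os.listdir(cache))
```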
"""


from ._fetchers import face, ascent, electrocardiogram  # noqa: E402
from ._download_all import download_all
from ._utils import clear_cache

__all__ = ['ascent', 'electrocardiogram', 'face',
           'download_all', 'clear_cache']


from scipy._lib._testutils import PytestTester
test = PytestTester(__name__)
del PytestTester