How to Clone Only a Subdirectory of a Git Repository?

In some scenarios, you may need to work with only a specific subdirectory of a large Git repository. Unfortunately, Git does not support cloning a subdirectory directly. However, there are a few effective workarounds to achieve this, including using sparse checkout or exporting the subdirectory. This guide will walk you through these methods.

Table of Contents

  • Approach 1: Using Sparse Checkout
  • Approach 2: Using Git Archive
  • Approach 3: Using Partial Clone (Git 2.19+)
  • Conclusion

Approach 1: Using Sparse Checkout

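Sparse checkout populates the working tree with only the paths you select. This is particularly useful for large repositories where you need only a specific subdirectory. Note that a regular clone still downloads the full object history; sparse checkout limits only which files are written to disk. The git sparse-checkout command used below requires Git 2.25 or later.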

Step 1: Initialize the Repository

First, clone the repository with the --no-checkout option to avoid checking out the files immediately.

git clone --no-checkout <repository-url>
cd <repository-directory>

Step 2: Enable Sparse Checkout

Configure Git to enable sparse checkout.

git sparse-checkout init

Step 3: Define the Subdirectory

Specify the subdirectory you want to clone. For example, if you want to clone the docs subdirectory:

git sparse-checkout set docs

Step 4: Check Out the Subdirectory

Now check out the branch. Only the specified subdirectory is written to the working tree (in cone mode, files in the repository root are included as well).

git checkout main

Replace main with the appropriate branch name if it differs.
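
The four steps above can be tried end to end without touching a real remote. The following is a minimal sketch, assuming Git 2.25 or later and a POSIX shell. The throwaway repository, its docs and src directories, and the file names are made up for illustration and stand in for <repository-url>.

```shell
set -eu

# Build a tiny throwaway repository (hypothetical content) to clone from.
work=$(mktemp -d)
git init -q "$work/origin"
git -C "$work/origin" symbolic-ref HEAD refs/heads/main   # name the first branch "main"
mkdir -p "$work/origin/docs" "$work/origin/src"
echo "guide" > "$work/origin/docs/guide.md"
echo "code"  > "$work/origin/src/app.c"
git -C "$work/origin" add .
git -C "$work/origin" -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "initial commit"

# Steps 1 to 4 from this section, applied to the throwaway repository.
git clone -q --no-checkout "$work/origin" "$work/clone"
cd "$work/clone"
git sparse-checkout init
git sparse-checkout set docs
git checkout -q main

# docs should be present in the working tree, src should not.
ls
```

All objects are still stored under .git in the clone; only the working tree is limited to docs.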


Approach 2: Using Git Archive

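The git archive command can produce an archive of a specific subdirectory directly from a remote, so this method doesn’t require cloning the entire repository. The remote server must allow the git upload-archive service; some hosts, including GitHub, do not support git archive --remote.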

Step 1: Create an Archive

Run the following command to stream a tar archive of the desired subdirectory and extract it into the current directory. Replace <repository-url> with your repository URL and <subdirectory> with the path to the subdirectory.

git archive --remote=<repository-url> HEAD:<subdirectory> | tar -x

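For example, to extract the docs subdirectory from a server that supports git archive --remote (GitHub does not, so a placeholder host is used here):

git archive --remote=ssh://git@git.example.com/user/repo.git HEAD:docs | tar -x

Because the tree-ish is HEAD:docs, the contents of docs are extracted directly into the current directory, without a leading docs/ path component and without any Git history.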
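
To see what the extracted result looks like, the same command can be pointed at a local repository, since --remote also accepts a filesystem path. This is a minimal sketch under that assumption; the throwaway repository, the export directory, and the file names are hypothetical.

```shell
set -eu

# Build a tiny throwaway repository (hypothetical content) to export from.
work=$(mktemp -d)
git init -q "$work/origin"
mkdir -p "$work/origin/docs" "$work/origin/src"
echo "guide" > "$work/origin/docs/guide.md"
echo "code"  > "$work/origin/src/app.c"
git -C "$work/origin" add .
git -C "$work/origin" -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "initial commit"

# Stream the docs subtree as a tar archive and unpack it into an empty directory.
mkdir "$work/export"
git archive --remote="$work/origin" HEAD:docs | tar -xf - -C "$work/export"

# The files from docs should sit directly in export; there is no .git directory.
ls -A "$work/export"
```

The result is a plain directory of files, not a repository, so it cannot be used to commit or pull later changes.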

Approach 3: Using Partial Clone (Git 2.19+)

Partial clone allows you to fetch only necessary objects. While not as precise as sparse checkout, it reduces the amount of data transferred.

Step 1: Clone the Repository with Partial Clone

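Use the --filter=blob:none option to skip downloading file contents (blobs) until they are actually needed, and add --no-checkout so the clone does not immediately check out, and therefore download, every file on the default branch. The filter doesn’t directly target subdirectories, but it keeps the initial download small. The server must support partial clone filters.

git clone --filter=blob:none --no-checkout <repository-url>
cd <repository-directory>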

Step 2: Configure Sparse Checkout

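Enable and set sparse checkout as shown in Approach 1 to get only the desired subdirectory. Git then downloads only the blobs needed for that subdirectory when you check out the branch. Although partial clone is available from Git 2.19, the git sparse-checkout command requires Git 2.25 or later.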

git sparse-checkout init
git sparse-checkout set <subdirectory>
git checkout main
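
The combined flow can be sketched against a local repository served over a file:// URL. This is an illustrative setup, not a recommended server configuration: the two uploadpack settings on the throwaway repository are assumptions needed only so that a local repository accepts filtered clones and on-demand blob requests, and all paths and file names are hypothetical. Git 2.25 or later is assumed.

```shell
set -eu

# Build a tiny throwaway repository (hypothetical content) to act as the server.
work=$(mktemp -d)
git init -q "$work/origin"
git -C "$work/origin" symbolic-ref HEAD refs/heads/main   # name the first branch "main"
mkdir -p "$work/origin/docs" "$work/origin/src"
echo "guide" > "$work/origin/docs/guide.md"
echo "code"  > "$work/origin/src/app.c"
git -C "$work/origin" add .
git -C "$work/origin" -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "initial commit"

# Demo-only assumption: let this local "server" honor filters and blob requests.
git -C "$work/origin" config uploadpack.allowFilter true
git -C "$work/origin" config uploadpack.allowAnySHA1InWant true

# Partial clone without blobs, then restrict the working tree to docs.
git clone -q --filter=blob:none --no-checkout "file://$work/origin" "$work/clone"
cd "$work/clone"
git sparse-checkout init
git sparse-checkout set docs
git checkout -q main

# Objects printed with a leading "?" have not been downloaded;
# the blob for src/app.c should be among them.
git rev-list --objects --missing=print HEAD | grep '^?' || true
```

Blobs outside docs are fetched later only if a command needs them, for example when the sparse-checkout patterns are widened.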

Conclusion

While Git does not provide a direct way to clone only a subdirectory, the methods outlined above offer effective workarounds. Sparse checkout is the most flexible and widely applicable method: it lets you selectively check out parts of a repository, although on its own it still downloads the full history. Using git archive is a simple way to extract a subdirectory without cloning, provided the server allows it and you do not need Git history. Partial clone combined with sparse checkout is the best fit for large repositories, because file contents outside the chosen subdirectory are not downloaded. Choose the method that best fits your needs.

