View and Clone a Git Repo with the Same URL using Nginx
Recently, I've been experimenting with building a web interface for my own self-hosted git repositories. The goal of the project is to provide a web view for each repository, as well as allow cloning of each repository over HTTPS. The source code for the project can be found here.
However, one small feature stumped me for quite a while: allowing cloning and the web view to use the same URL. It's a little thing, but I found myself coming back to it again and again, trying to make it work. I figured that if GitHub could manage it, then so could I.
Here's the first part of the problem. I wanted the user to be able to access a web interface for each repository by navigating to https://git.adamheins.com/<repo>. I also wanted the user to be able to run git clone https://git.adamheins.com/<repo> to clone the repository. It was straightforward to have https://git.adamheins.com/<repo>.git work for cloning, since that URL is differentiated by the .git ending. This allows the web server to identify it and treat it as a clone request. The tricky part was getting clone to work without the .git ending: the same URL used for the web view. Somehow, these needed to be differentiated.
The solution is all in the web server configuration. I'll show how it's done with the web server I used, nginx.
Here are the relevant parts of the nginx configuration for git.adamheins.com:
server {
    listen 443 ssl;
    server_name git.adamheins.com;
    server_tokens off;

    # SSL.
    include /etc/nginx/includes/adamheins.com/ssl.conf;

    # Static git repo content.
    location ~ ^.*/objects/([0-9a-f]+/[0-9a-f]+|pack/pack-[0-9a-f]+.(pack|idx))$ {
        root /home/git/;

        # Rewrite the URI to include .git extension if it is missing.
        if ($uri !~ ^.*\.git.*) {
            rewrite ^(/[0-9a-zA-Z\-\_]+)/(.*)$ $1.git/$2;
        }
    }

    # Git content requiring git-http-backend.
    location ~ ^.*/(HEAD|info/refs|objects/info/.*|git-(upload|receive)-pack)$ {
        root /home/git/;

        if ($uri !~ ^.*\.git.*) {
            rewrite ^(/[0-9a-zA-Z\-\_]+)/(.*)$ $1.git/$2;
        }

        client_max_body_size 0;
        fastcgi_param SCRIPT_FILENAME /usr/lib/git-core/git-http-backend;
        include fastcgi_params;
        fastcgi_param GIT_HTTP_EXPORT_ALL "";
        fastcgi_param GIT_PROJECT_ROOT /home/git;
        fastcgi_param PATH_INFO $uri;
    }

    # Web frontend.
    location / {
        proxy_pass http://git;
        proxy_redirect off;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto https;
    }
}
The first things to notice here are the complicated-looking regexes in the first two location blocks. These two blocks are for handling git clone requests. The first location block matches static content in the repository (loose objects and pack files), which nginx can serve straight from disk once it knows where the repositories live. The second location block handles requests that use git's so-called Smart HTTP functionality, which must be handed to git-http-backend.
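Together, these two location blocks handle all requests made by a git clone command. Git never requests the bare repository URL itself; it appends paths such as /info/refs and /git-upload-pack to the clone URL, and those suffixes are what the regexes match.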
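As a rough illustration of what those requests look like, the sketch below lists the first two requests a Smart HTTP clone makes for a given clone URL, following git's HTTP protocol documentation. The function name smart_http_clone_requests is hypothetical, and a real clone may send further requests, or fall back to the older "dumb" protocol, which fetches static files such as HEAD and the contents of objects/ instead.

```python
from urllib.parse import urlsplit


def smart_http_clone_requests(clone_url):
    """Return (method, path) pairs for the opening requests of a Smart HTTP clone.

    Hypothetical helper for illustration only; it performs no network I/O.
    """
    # Only the path matters to nginx's location matching, so drop scheme and host.
    base = urlsplit(clone_url).path.rstrip("/")
    return [
        # Reference discovery: the client asks which refs the server has.
        ("GET", base + "/info/refs?service=git-upload-pack"),
        # Pack negotiation and transfer.
        ("POST", base + "/git-upload-pack"),
    ]


if __name__ == "__main__":
    for method, path in smart_http_clone_requests("https://git.adamheins.com/my-repo"):
        print(method, path)
```

Neither path is the bare /my-repo used by the web view, even though the clone URL and the web URL are identical.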
The third location block, which has no complicated regex, serves the web interface. Every request whose path matches neither of the two regexes, which includes all web view requests, is handled by this block.
The result of having the three distinct blocks is that web view requests and git clone requests are differentiated. The first two blocks serve clone requests, while the third deals with the web view. The problem of differentiating between clone and web view requests is now solved.
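The routing can be emulated outside nginx. The following Python sketch copies the two location regexes from the configuration and applies the selection order nginx uses for this config: regex locations are tried in the order they appear, and the prefix location / is the fallback. The function pick_location and its return labels are made up for illustration, /my-repo/tree/master is a hypothetical web view path, and Python's re module stands in for the PCRE engine nginx uses, which should behave the same for these patterns.

```python
import re

# Patterns copied from the two regex location blocks in the nginx config.
STATIC_RE = re.compile(
    r"^.*/objects/([0-9a-f]+/[0-9a-f]+|pack/pack-[0-9a-f]+.(pack|idx))$"
)
BACKEND_RE = re.compile(
    r"^.*/(HEAD|info/refs|objects/info/.*|git-(upload|receive)-pack)$"
)


def pick_location(path):
    """Return a label for the location block that would handle `path`.

    `path` is the request path without the query string, which is what
    nginx matches location regexes against.
    """
    if STATIC_RE.search(path):
        return "static"            # first location block
    if BACKEND_RE.search(path):
        return "git-http-backend"  # second location block
    return "web"                   # fallback: location /


if __name__ == "__main__":
    for path in (
        "/my-repo",
        "/my-repo/tree/master",
        "/my-repo/info/refs",
        "/my-repo/git-upload-pack",
        "/my-repo.git/objects/ab/cdef0123",
    ):
        print(path, "->", pick_location(path))
```

The bare repository URL falls through to the web frontend, while the paths a clone requests are caught by one of the first two blocks.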
However, this is not the end of the story. The remaining challenge was to have the web server figure out how to route a clone request without a .git ending.
All of my repositories are stored as directories with the .git ending, so clone requests that include it just work. Clone requests that omit the .git ending require some rewriting.
This brings us to the second thing to notice: the if block that appears in each of the first two location blocks.
if ($uri !~ ^.*\.git.*) {
    rewrite ^(/[0-9a-zA-Z\-\_]+)/(.*)$ $1.git/$2;
}
Let's tackle the if block in pieces. The if block's condition is:
$uri !~ ^.*\.git.*
The !~ operator is true when the URI does not match the regex, and the regex matches any URI that contains .git. So the body of the if block is executed only when .git is missing from the URI. The body contains the following rewrite rule:
rewrite ^(/[0-9a-zA-Z\-\_]+)/(.*)$ $1.git/$2;
The rewrite rule has two parameters: a regex to match the URL, and a rule for how to rewrite the URL if it matches the regex.
The regex, ^(/[0-9a-zA-Z\-\_]+)/(.*)$, contains two match groups. The first group, /[0-9a-zA-Z\-\_]+, matches the initial slash and the repository name, which can consist of letters, numbers, dashes, and underscores. The second group, .*, matches everything after the slash following the repository name.
For example, let's say the requested URL is /my-repo/stuff/in/repo. This request is trying to access content in a repository called my-repo (which is actually stored as my-repo.git). The first match group in the regex will match /my-repo. The second match group will match stuff/in/repo.
Since the URL matches the regex, it gets rewritten to the form $1.git/$2, where $1 and $2 denote the first and second match groups from the regex, respectively. Ultimately, the rule rewrites the URL so that .git/ is placed between the two match groups. Thus the URL from the above example, /my-repo/stuff/in/repo, gets rewritten to /my-repo.git/stuff/in/repo. Now the URL is exactly the same as if it had included the .git ending in the first place.
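The if condition and the rewrite rule can be tried out together in a few lines of Python. This is only an emulation under the assumption that Python's re module treats these two patterns the same way nginx's regex engine does; the function name add_git_suffix is invented for the example, and \1 and \2 are Python's spelling of nginx's $1 and $2.

```python
import re

# The condition from the if block: does the URI contain ".git" anywhere?
HAS_GIT_RE = re.compile(r"^.*\.git.*")
# The regex from the rewrite rule: (/repo-name) / (rest of the path).
REWRITE_RE = re.compile(r"^(/[0-9a-zA-Z\-\_]+)/(.*)$")


def add_git_suffix(uri):
    """Emulate the if block: insert .git after the repository name if absent."""
    if HAS_GIT_RE.search(uri):
        # Condition `$uri !~ ...` is false, so the rewrite is skipped.
        return uri
    # If the rewrite regex does not match, sub() returns the URI unchanged,
    # just as nginx leaves the URI alone when a rewrite rule does not match.
    return REWRITE_RE.sub(r"\1.git/\2", uri, count=1)


if __name__ == "__main__":
    for uri in (
        "/my-repo/stuff/in/repo",
        "/my-repo/info/refs",
        "/my-repo.git/info/refs",
    ):
        print(uri, "->", add_git_suffix(uri))
```

Trying other inputs also exposes the limits of the pattern: a repository name containing a dot, such as /my.repo/info/refs, is left untouched because the first match group does not allow dots.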
In summary, this configuration allows clone and web view requests for a repository to use the same URL. As a bonus, cloning works both with and without the .git ending.