Getting Reports from Git

Amir Ebrahimi Fard
Data Management for Researchers
5 min read · Jul 26, 2021

It is essential to be able to inspect the status and history of a Git repository at any moment. Luckily, Git provides multiple ways of getting reports; here we introduce three of them: git status, git log, and git diff.

One of the most useful commands in Git is git status. It reports the state of the branches, the modifications in the working directory, and the contents of the staging area¹. Depending on the state of the files, the command produces different messages, which are explained in Table 1.

Table 1: Four different kinds of messages that git status may produce, depending on the status of the working directory and stage [1].
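To see how the message changes with the state of the files, the following minimal sketch walks a single file through the states. It assumes a throwaway repository created with mktemp; the file name notes.txt and the demo identity are hypothetical, and the --short flag in the last step is an addition that is not covered in the text above.

```shell
set -eu

# Create a scratch repository so nothing real is touched (hypothetical demo setup).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# State 1: a new file that Git does not track yet.
echo "first line" > notes.txt
git status

# State 2: the file is in the staging area, ready to be committed.
git add notes.txt
git status

# State 3: after the commit there is nothing left to report.
git commit -q -m "Add notes.txt"
git status

# State 4: a tracked file is modified but not staged.
echo "second line" >> notes.txt
git status

# The same information in compact form: " M" means modified and not staged.
status_short="$(git status --short)"
printf '%s\n' "$status_short"
```

Running git status after every step is a cheap habit: the message also suggests which command to run next.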

Another command used to get a report is git log. It provides a detailed history of all commits. Every commit is recorded as a log entry made up of the components shown in Figure 1.

Figure 1: Components of a Git commit log entry.

Figure 2 shows an example of the output of the git log command.

Figure 2: An example of output after running the git log command.

The git log command is often used with multiple options, explained in Table 2. At the end of this section, we’ll use a real example to show the differences between some of these options.

Table 2: Different options used with git log.

Run as shown above, these commands return the complete history of commits. Appending the unique identifier (SHA) of a particular commit to the command makes Git display only the history from the first commit up to that commit. The options can also be combined. For instance, git log --all --graph --decorate --oneline is very useful, especially when the repository has multiple branches.
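The behaviour described above can be tried out with a small sketch. It assumes a scratch repository with two commits on the initial branch and one commit on a side branch; the branch name feature, the file names, and the demo identity are hypothetical.

```shell
set -eu

# Scratch repository (hypothetical demo setup).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# Two commits on the initial branch.
echo "one" > file1.txt
git add file1.txt
git commit -q -m "First commit"
first_sha="$(git rev-parse HEAD)"   # remember the identifier of the first commit

echo "two" >> file1.txt
git commit -q -am "Second commit"

# One commit on a side branch, then switch back to the initial branch.
git checkout -q -b feature
echo "x" > feature.txt
git add feature.txt
git commit -q -m "Feature commit"
git checkout -q -

# Complete history of the current branch, one line per commit.
git log --oneline

# History from the beginning up to (and including) the first commit only.
git log --oneline "$first_sha"

# All branches, drawn as a graph with branch names attached.
git log --all --graph --decorate --oneline

# Line counts of the three reports, kept for comparison.
count_current="$(git log --oneline | wc -l | tr -d ' ')"
count_first="$(git log --oneline "$first_sha" | wc -l | tr -d ' ')"
count_all="$(git log --all --oneline | wc -l | tr -d ' ')"
```

The commit on the side branch appears only when --all is given, which is why the combined command is handy once a repository has more than one branch.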

If we are interested in inspecting just one commit, we can use git show <commit-SHA>. This command accepts the --patch and --stat switches. Another command for getting a report is git diff, which shows the differences between two versions of the project, for example two commits [2]. This command can be used in several different ways:

Table 3: Different options used with git diff.
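As a rough illustration, the sketch below runs git show on one commit and then three commonly used forms of git diff. Which forms Table 3 actually lists is an assumption here; the scratch repository, file name, and demo identity are hypothetical, and --name-only is used only to keep the captured output short.

```shell
set -eu

# Scratch repository with two commits (hypothetical demo setup).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "one" > file1.txt
git add file1.txt
git commit -q -m "First commit"
first_sha="$(git rev-parse HEAD)"

echo "two" >> file1.txt
git commit -q -am "Second commit"

# Inspect a single commit: a summary of touched files, then the full patch.
git show --stat "$first_sha"
git show --patch "$first_sha"

# Form 1: working directory compared with the staging area.
echo "three" >> file1.txt
git diff
unstaged="$(git diff --name-only)"

# Form 2: staging area compared with the last commit.
git add file1.txt
git diff --staged
staged="$(git diff --staged --name-only)"

# Form 3: one commit compared with another.
git diff "$first_sha" HEAD
between="$(git diff --name-only "$first_sha" HEAD)"
```

After git add, plain git diff prints nothing, because the working directory and the staging area are identical again; the change has moved to the --staged report.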

The output of git diff can be hard to read at first, so below we explain its parts using the simple example displayed in Figure 3.

Figure 3: Sample output from running the git diff command [2].

The output of git diff comprises four elements:

  1. file comparison: This field displays the files that are compared. In most cases, a and b will be different versions of the same file².
  2. file metadata: The first two numbers are the hashes (or, simply put, “IDs”) of the two file versions. The last number is an internal file mode identifier (100644 is just a “normal file”, while 100755 specifies an executable file and 120000 represents a symbolic link).
  3. markers for the files: Because a and b are compared line by line further down, a symbol is assigned to each file here: “-” for a and “+” for b.
  4. chunk: The output of git diff does not show the entire file in the two commits. Instead, it shows only the parts that were actually modified. A chunk usually also includes a few unchanged lines before and after the modification, so you can better understand the context in which the change happened.
  • chunk header: The first line of a chunk is the chunk header. Enclosed between two @@ signs are two pairs of numbers that tell which lines are extracted from which file. Each pair is prefixed with the symbol corresponding to a or b. The first number in each pair is the starting line, and the second is the number of lines extracted from that file.
  • chunk changes: Each changed line is prefixed with either a “+” or a “-” symbol: “-” marks a line that appears only in a, and “+” marks a line that appears only in b.
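The four elements can be reproduced with a small sketch. It assumes a scratch repository with default Git settings and a hypothetical five-line file notes.txt in which the third line changes between two commits; this is not the example from Figure 3. The hashes on the metadata line differ from run to run, so they are shown as placeholders in the comments.

```shell
set -eu

# Scratch repository (hypothetical demo setup).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# Version a: five lines.
printf 'alpha\nbeta\ngamma\ndelta\nepsilon\n' > notes.txt
git add notes.txt
git commit -q -m "Add notes.txt"

# Version b: the third line is changed.
printf 'alpha\nbeta\nGAMMA\ndelta\nepsilon\n' > notes.txt
git commit -q -am "Capitalise gamma"

# Compare the previous commit (a) with the current one (b).
diff_out="$(git diff HEAD~1 HEAD)"
printf '%s\n' "$diff_out"

# With default settings the output has this shape:
#   diff --git a/notes.txt b/notes.txt   <- 1. file comparison
#   index <hash-a>..<hash-b> 100644      <- 2. file metadata
#   --- a/notes.txt                      <- 3. marker for a
#   +++ b/notes.txt                      <- 3. marker for b
#   @@ -1,5 +1,5 @@                      <- 4. chunk header
#    alpha                               <- unchanged context line
#    beta
#   -gamma                               <- line only in a
#   +GAMMA                               <- line only in b
#    delta
#    epsilon
```

The header @@ -1,5 +1,5 @@ reads as: five lines starting at line 1 are taken from a, and five lines starting at line 1 are taken from b. The whole file appears here only because it is short enough to fall within the context lines around the change.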

For instance, in this example, after running git diff for two commits, the returned output shows differences in two chunks. The first chunk compares two versions of file1.txt: the b version (marked with “+”) has only one line, while the a version (marked with “-”) has two lines. The second chunk inspects file2.txt, but because file2.txt does not exist in one of the commits, only one version is available (a, marked with “-”). This chunk shows that file2.txt did not exist in the first commit and was created, with one line in it, in the second commit.

Sometimes the two commands git log and git diff are combined. There are two ways of doing so. The first is git log --stat, which shows the output of git log together with a summary of the changes each commit made relative to the previous one. As Figure 4 shows, for every commit, in addition to the regular components of the log entry, the number of lines inserted into and deleted from each file is reported [3].

Figure 4: An example of running git log --stat.

The other is git log -p, which returns more detailed information about each commit. As Figure 5 shows, it effectively combines the functionality of the git log and git diff commands by showing exactly what changed in the files relative to the previous commit.

Figure 5: An example of running git log -p.
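Both combined forms can be compared side by side with a short sketch, assuming a scratch repository with two commits that touch a hypothetical file1.txt; the demo identity is also made up.

```shell
set -eu

# Scratch repository (hypothetical demo setup).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "one" > file1.txt
git add file1.txt
git commit -q -m "First commit"

echo "two" >> file1.txt
git commit -q -am "Second commit"

# Log entries plus a per-file summary of inserted and deleted lines.
stat_out="$(git log --stat)"
printf '%s\n' "$stat_out"

# Log entries plus the full diff that each commit introduced.
patch_out="$(git log -p)"
printf '%s\n' "$patch_out"
```

The --stat form is a quick way to find which commit touched a file; the -p form is the one to use when the exact changed lines matter.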

Footnotes

  1. Here, we only focus on the information on the modifications of the working directory and the staging area. The information on the branches will be explained later.
  2. Although not used very often, a diff could also compare two completely unrelated files with each other to show how they differ.

Amir Ebrahimi Fard
Data Management for Researchers

Postdoc Researcher on AI Explainability - Interested in the intersection of data, algorithm, and society.