We will not be using ilearn in this class. Instead, we will be using a version control system called git. Version control systems are widely used in industry and in open source projects. They are the tools that let many programmers work together on large, complex software. I don’t know what programming language you will use at your future job (it may not even exist yet!), but I guarantee you will be using version control.
In this lab, you will first learn the basics of how to use git and github. Then, we will discuss how to use these tools to access your grades and submit assignments.
creating your first repo (and some basic unix commands)
Open a terminal, and cd into the directory you will be doing your cs100 work in. Then create a folder named firstrepo and cd into it:
$ mkdir firstrepo
$ cd firstrepo
This folder will be the home of your first git repository. Run the following command to initialize it:
$ git init
All the information for the git repository is stored in a hidden folder called .git. The folder is hidden because its name begins with a dot (.). By default, the ls command does not display hidden files and folders. To display them, you must pass the -a flag. Compare the results of the following two commands:
$ ls
$ ls -a
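The same comparison can be tried outside of a git repo. The sketch below uses a throwaway directory, and the file names visible.txt and .hidden are made up for illustration:

```shell
# Work in a scratch directory so nothing in your cs100 folder is touched.
demo=$(mktemp -d)
cd "$demo"
touch visible.txt .hidden   # hypothetical file names

plain=$(ls)      # lists only names that do not start with a dot
all=$(ls -a)     # also lists dot files, plus the entries . and ..

echo "ls:"
echo "$plain"
echo "ls -a:"
echo "$all"
```

The .git folder behaves exactly like .hidden here: it is an ordinary folder whose name happens to start with a dot.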
Now we are ready to add some files into our repo. Every repo in this class must have a README file. Create the file using the following command:
$ touch README
The touch command is a standard unix command. If the input file does not already exist, touch creates an empty file with that name. If the file does already exist, it updates the file’s timestamp to the current time. The ls -l command displays the full information about each file in the current directory. Run the following commands:
$ ls -l
$ touch README
$ ls -l
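Notice how the timestamp in the first ls is different than the timestamp in the second ls. (The listing shows times only to the minute, so if the two look identical, wait a minute and run the touch command again.)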
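Here is a sketch that makes the timestamp change easy to see. It assumes your touch supports the POSIX -t option for setting an explicit time, and notes.txt is a hypothetical file name:

```shell
demo=$(mktemp -d)
cd "$demo"

touch notes.txt                  # file does not exist yet, so touch creates it empty
touch -t 200001010000 notes.txt  # set the timestamp to 1 Jan 2000, 00:00
before=$(ls -l notes.txt)

touch notes.txt                  # file exists, so only its timestamp is updated
after=$(ls -l notes.txt)

echo "$before"
echo "$after"
```

The first listing shows a date in the year 2000 and the second shows the current time, while the file size stays at 0 in both.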
We’ve created our first file, but git doesn’t know about it yet. Run the command:
$ git status
In the output, there is a section labeled “Untracked files.” Notice that the README file is in this section. We need to add it into our project using the command:
$ git add README
Now, when we run git status, there is a section labeled “Changes to be committed” with the README file underneath it.
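The move from untracked to staged can also be watched in a compact form with git status --short, which prints a two-character status code before each file name. A minimal sketch in a throwaway repo (the scratch directory is only for illustration):

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
touch README

untracked=$(git status --short)   # "?? README": git sees the file but does not track it
git add README
staged=$(git status --short)      # "A  README": the file is staged for the next commit

echo "$untracked"
echo "$staged"
```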
Whenever we finish a task in our repo, we “commit” our changes. This tells git to save the state of the repo so that we can come back to it later if we need to. Commit your changes using the command:
$ git commit -m "my first commit"
Every commit needs a “commit message” that describes what changes we made in the repo. Writing clear, succinct, informative commit messages is one of the keys to using git effectively. In this case, we passed the -m flag to git, so the commit message was specified in the command line. If we did not pass the flag, then git would have opened a text editor (often vim, depending on how your system is configured) for us to type a longer commit message. Whether or not you use the -m flag is purely a matter of style, but in my experience, it’s usually easier to add the flag.
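If you want a longer message without leaving the command line, git commit accepts the -m flag more than once and joins the values as separate paragraphs. A sketch in a throwaway repo; the name, email, and message text are placeholder values:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Student"        # placeholder identity, stored in this repo only
git config user.email "student@example.com"

touch README
git add README
# The first -m becomes the subject line, the second -m becomes the body.
git commit -q -m "my first commit" -m "Adds an empty README as a starting point."

subject=$(git log -1 --format=%s)
body=$(git log -1 --format=%b)
echo "subject: $subject"
echo "body:    $body"
```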
Let’s add some actual code to our project. Create a file main.cpp with the following code:
#include <iostream>
int main()
{
    std::cout << "hello git" << std::endl;
    return 1;
}
Compile and run the code:
$ g++ main.cpp
$ ./a.out
Then add it to the repo and commit your changes:
$ git add main.cpp
$ git commit -m "added the first code"
IMPORTANT: Notice that we only added the main.cpp file to our repo, and did not add a.out. Never add executable or object files to your git repo! Only add source files! Tracking executables uses LOTS of disk space, and makes the repo cluttered and hard to read. If we ever see these files in your git repos, your grade on the assignment will be docked 20%.
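One common way to avoid adding a.out by accident is a .gitignore file. This lab does not otherwise cover it, so treat the following as an optional sketch. The file lists name patterns that git should leave out of git status and git add; the empty files below are stand-ins for a real source file and real build products:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

touch main.cpp a.out main.o          # stand-ins for a source file and build products
printf 'a.out\n*.o\n' > .gitignore   # ignore the executable and all object files

git add .                            # adds everything that is not ignored
staged=$(git status --short)
echo "$staged"
```

Only .gitignore and main.cpp show up as staged; a.out and main.o are skipped.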
Let’s make one more commit so we’ll have something to play with. Run the command:
$ echo "This program prints \"hello git\"" > README
Remember that echo prints to stdout and the > does output redirection. So this command replaces the contents of the README file.
The command cat prints the contents of a file to stdout. Verify that your README file has changed using the command:
$ cat README
Now run:
$ git commit -m "modified the README"
Uh oh!
We got an error message saying: “no changes added to commit”.
Every time you modify a file, if you want that file included in the commit, you must explicitly tell git to add the file again. This is because sometimes programmers want to commit only some of the modified files. We can commit the changes by:
$ git add README
$ git commit -m "modified the README"
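The whole sequence, including the refused commit, can be replayed in a throwaway repo. This is a sketch with a placeholder identity, not a record of your own repo:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Student"        # placeholder identity
git config user.email "student@example.com"
touch README
git add README
git commit -q -m "my first commit"

echo "This program prints \"hello git\"" > README

# Nothing is staged yet, so git refuses to commit and exits with a non-zero status.
git commit -q -m "modified the README" || echo "commit refused: nothing staged"

git add README                                 # stage the new contents
git commit -q -m "modified the README"
count=$(git rev-list --count HEAD)             # number of commits reachable from HEAD
echo "commits so far: $count"
```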
traveling through time
Okay!
Now we’re ready to take advantage of git’s power.
Run the command:
$ git log
This gives us a history of all our commits. For each commit, there are four pieces of information. The first is the commit identifier. This is a long hexadecimal sequence, for example: 093d5fa3c60ce204b6ddba86d4f9c355b4856f10. (Technically, this is a SHA-1 hash of your commit. It is a cryptographic hash, which means that in practice every commit gets a unique identifier. Take cs165 to find out more!) Next is the author of the commit, the date of the commit, and the commit message.
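You rarely need to read or type the full identifier. The sketch below prints the full hash next to the abbreviated form that git log --oneline uses (throwaway repo, placeholder identity):

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Student"        # placeholder identity
git config user.email "student@example.com"
touch README
git add README
git commit -q -m "my first commit"

full=$(git rev-parse HEAD)            # full identifier of the newest commit
short=$(git rev-parse --short HEAD)   # abbreviated prefix of the same identifier
echo "full:  $full"
echo "short: $short"
git log --oneline                     # one line per commit: short hash, then the message
```

Your hashes will differ from anyone else's, because the author, date, and contents all feed into the hash.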
Sometimes, we want to look at what the state of our repo was in a previous commit. There are many reasons this is useful. For example, maybe your latest changes broke some functionality and you want to see what working code looked like. Or, maybe a user of your code reported a bug, but they’re using an old version of the software; we need to look at the old version of the code to reproduce the bug.
We can inspect the previous state of our code using the git checkout command. This command takes as a parameter the hash of the commit we want to inspect. For me, the hash of “my first commit” is a20aef2096d98ab53d1495f823409e2cc8cd54b9. So to inspect that commit, I would run the command below. (You should replace the hash with the hash of your “my first commit”.)
$ git checkout a20aef2096d98ab53d1495f823409e2cc8cd54b9
Now let’s see what happened. Run the command:
$ cat README
The file is empty again!
Now run:
$ ls -l
Your main.cpp file disappeared! All of the files tracked by git have returned to their previous state. But notice that your a.out program still exists unmodified. (Run it just to be sure.) This file was never tracked by git, and so it is unmodified when we checkout different commits.
Let’s restore all those changes. Run the command:
$ git checkout master
And verify that our changes have been restored:
$ cat README
$ ls -l
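The round trip can be scripted so that you do not have to copy a hash by hand. This sketch stores the hash and the branch name in shell variables; the repo and identity are throwaway placeholders:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Student"        # placeholder identity
git config user.email "student@example.com"

touch README
git add README
git commit -q -m "my first commit"
first=$(git rev-parse HEAD)                # remember the hash of the first commit
branch=$(git symbolic-ref --short HEAD)    # name of the current branch (master in this lab)

echo "This program prints \"hello git\"" > README
git add README
git commit -q -m "modified the README"

git checkout -q "$first"                   # tracked files go back to the first commit
old=$(cat README)                          # README was empty at that point
git checkout -q "$branch"                  # return to the newest commit on the branch
new=$(cat README)
echo "old: '$old'"
echo "new: '$new'"
```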
git repos are trees
Another important use of version control systems is working with multiple versions of the same project at once. This is VERY useful. Also, you’ll be required to do this in future homework assignments (and all throughout your illustrious careers), so pay attention!
Every version of our repo is called a “branch.” A project can have many branches, and every branch can be completely different than every other branch. List the branches in your current project using the command:
$ git branch
This should list just a single branch called master. This branch was created for you automatically when you ran the git init command. (Depending on your version and configuration of git, the default branch may be named main instead; if so, use that name wherever this lab says master.)
One way to think of branches is as a nice label for your commit hashes. Your “master” branch currently points to your commit with the message “modified the README.” That’s why when we ran git checkout master above, it restored our project to the state of that commit. We could also have used git checkout [hash], if you replaced [hash] with the appropriate hash value. But that’s much less convenient. When you use git checkout in the future, you will usually be using it on branch names.
From now on, we’ll be drawing pictures of our git repos so you can visualize what’s going on. Currently, our repo looks like:
The purple boxes represent all the commits we’ve done, and the blue box represents a branch.
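You can also ask git to draw a text version of this picture. The sketch below builds a small throwaway repo with two commits (placeholder identity) and prints its tree:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Student"        # placeholder identity
git config user.email "student@example.com"
touch README
git add README
git commit -q -m "my first commit"
echo "This program prints \"hello git\"" > README
git add README
git commit -q -m "modified the README"

# --graph draws the commit tree, --decorate labels commits with branch names,
# --all includes every branch rather than only the current one.
graph=$(git log --graph --oneline --decorate --all)
echo "$graph"
```

Each line starting with * is one commit (a purple box), and the names in parentheses are the branches pointing at it (the blue boxes).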
Every time we add a new feature to a project, we create a branch for that feature. Let’s create a branch called userinput in our project with the command:
$ git branch userinput
Verify that our branch was created successfully:
$ git branch
You should see two branches now. There should be an asterisk next to the master branch. This tells us that master is the currently active branch, and if we commit any new changes, they will be added to the master branch. (That is, master will change to point to whatever your new commit is.)
Our repo tree now looks like:
Switch to our new branch using the command:
$ git checkout userinput
Now run:
$ git branch
and verify that the asterisk is next to the userinput branch.
Since the only thing you did was switch branches, the repo tree looks almost the same.
The only difference is the asterisk has moved.
Let’s modify our main.cpp file so that it asks the user their name before saying hello:
#include <iostream>
#include <string>
int main()
{
    std::string name;
    std::cout << "What is your name?" << std::endl;
    std::cin >> name;
    std::cout << "Hello " << name << "!" << std::endl;
    return 1;
}
We commit our changes to the current working branch the same way we committed them before:
$ git add main.cpp
$ git commit -m "added user input"
Before this commit, the userinput and master branches were pointing to the same commit. When you run this command, the userinput branch gets updated to point to this new commit.
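To check that a branch really is just a label on a commit, you can ask git which hash each branch name resolves to. A sketch in a throwaway repo, with a placeholder identity and a one-line stand-in for main.cpp:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Student"        # placeholder identity
git config user.email "student@example.com"
touch README
git add README
git commit -q -m "my first commit"
main_branch=$(git symbolic-ref --short HEAD)   # master in this lab

git branch userinput
# Right after the branch is created, both labels resolve to the same commit.
before_main=$(git rev-parse "$main_branch")
before_userinput=$(git rev-parse userinput)

git checkout -q userinput
echo "// stand-in for the new main.cpp" > main.cpp
git add main.cpp
git commit -q -m "added user input"

# Only the checked-out branch moved; the other label still names the old commit.
after_main=$(git rev-parse "$main_branch")
after_userinput=$(git rev-parse userinput)
echo "$main_branch: $before_main -> $after_main"
echo "userinput: $before_userinput -> $after_userinput"
```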
Now your tree looks like:
Let’s verify that our changes affected only the userinput branch and not the master branch. First, checkout the master branch, then cat the main.cpp file, then return to the userinput branch.
$ git checkout master
$ cat main.cpp
$ git checkout userinput
We’re not done with this feature yet. Whenever you add a feature, you also have to update the documentation! Properly documenting your code will be a huge part of your grade in this course!
Update the README file with the command:
$ echo "This program asks the user for their name, then says hello." > README
And add it to the repo:
$ git add README
$ git commit -m "updated README"
Your repo tree now looks like:
The way branches are used out in the real world depends on the company you work for and the product you’re building. A typical software engineer might make anywhere from one new branch per week to 5 or more new branches per day.
fixing a bug
Wait!
While we were working on our userinput branch, someone reported a bug in our master branch. In particular, the main function in our master branch returns 1, but a successful program should return 0. In UNIX, any return value other than 0 indicates that some sort of error occurred.
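The shell stores the return value of the last command in the variable $?. The sketch below uses sh -c 'exit N' as a stand-in for a program whose main function returns N, so it runs without compiling anything:

```shell
status=0
sh -c 'exit 1' || status=$?     # stand-in for running the buggy ./a.out
echo "exit status: $status"     # prints "exit status: 1"

# Shell conditionals treat 0 as success and anything else as failure.
if sh -c 'exit 0'; then
    result="success"
else
    result="failure"
fi
echo "$result"                  # prints "success"
```

This is why the bug matters: any script or tool that runs our program would conclude that it failed.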
To fix this bug, we first checkout our master branch:
$ git checkout master
Then create a bugfix branch and check it out:
$ git branch bugfix
$ git checkout bugfix
Here’s the tree.
Notice that the bugfix branch starts where the master branch was because we switched to master before creating bugfix.
Now we’re ready to edit the code. Update the main function to return 0, then commit your changes:
$ git add main.cpp
$ git commit -m "fixed the return 1 bug"
Since you made the commit on the bugfix branch, your tree splits off in another direction and now looks like this:
merging branches
We want our users to get access to the fixed software, so we have to add our bugfix code into the master branch. This process is called “merging.”
In this case it is a simple procedure.
First, checkout the master branch:
$ git checkout master
Then run the command:
$ git merge bugfix
This automatically updates the modified files.
Your tree will now look like this:
Using branches like this to patch bugs is an extremely common usage pattern. Whether you’re developing open source software or working on facebook’s user interface, this is the same basic procedure you will follow.
With real bugs on more complicated software, bug fixes won’t be quite this easy. They might require editing several different files and many commits. It might take us weeks just to find out what’s even causing the bug! By putting our changes in a separate branch, we make it easy to have someone fixing the bug while someone else is adding new features.
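The bugfix merge in this lab is the simplest kind: master had no new commits of its own, so git only has to move the master label forward to the bugfix commit (git reports this as a fast-forward). A sketch in a throwaway repo, with a placeholder identity and a shortened main.cpp:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Student"        # placeholder identity
git config user.email "student@example.com"
printf 'int main()\n{\n    return 1;\n}\n' > main.cpp
git add main.cpp
git commit -q -m "added the first code"
main_branch=$(git symbolic-ref --short HEAD)   # master in this lab

git branch bugfix
git checkout -q bugfix
printf 'int main()\n{\n    return 0;\n}\n' > main.cpp
git add main.cpp
git commit -q -m "fixed the return 1 bug"

git checkout -q "$main_branch"
git merge bugfix                               # moves the label; no conflict is possible here

merged=$(git rev-parse HEAD)
fix=$(git rev-parse bugfix)
echo "$main_branch now points to $merged"
```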
merge conflicts
Our userinput feature is also ready now. We’ve tested it and are sure it’s working correctly. It’s time to merge this feature with the master branch. Run the commands:
$ git checkout master
$ git merge userinput
Ouch!
We get an error message saying:
Auto-merging main.cpp
CONFLICT (content): Merge conflict in main.cpp
Automatic merge failed; fix conflicts and then commit the result.
This error is called a “merge conflict” and is one of the hardest concepts for new git users to understand. Why did this happen?
In our bugfix branch above, git automatically merged the main.cpp file for us. It could do this because the main.cpp file in the master branch did not change after we created the bugfix branch. Unfortunately, after we merged the bugfix branch into master, this changed the main.cpp file. Now when git tries to merge our changes from the userinput branch, it doesn’t know which parts to keep from userinput, and which parts to keep from bugfix. We have to tell git how to do this manually.
If you inspect the contents of the main.cpp file, you’ll see something like:
#include <iostream>
#include <string>
int main()
{
<<<<<<< HEAD
    std::cout << "hello git" << std::endl;
    return 0;
=======
    std::string name;
    std::cout << "What is your name?" << std::endl;
    std::cin >> name;
    std::cout << "Hello " << name << "!" << std::endl;
    return 1;
>>>>>>> userinput
}
As you can see, the file is divided into several sections. Any line not between the <<<<<<< and >>>>>>> lines was merged automatically and needs no attention. The lines between <<<<<<< HEAD and ======= belong only to the version in the master branch. And the lines between ======= and >>>>>>> userinput belong only to the userinput branch.
The key to solving a merge conflict is to edit the lines between <<<<<<< and >>>>>>> to include only the correct information from each branch, and to delete the marker lines themselves. In our case, we want the return statement from the master branch, and all of the input/output from the userinput branch. So we should modify the main.cpp file to be:
#include <iostream>
#include <string>
int main()
{
    std::string name;
    std::cout << "What is your name?" << std::endl;
    std::cin >> name;
    std::cout << "Hello " << name << "!" << std::endl;
    return 0;
}
Once we have resolved this merge conflict, we can finalize our merge. We first tell git that we’ve solved the conflict by adding the conflicting files, then we perform a standard commit:
$ git add main.cpp
$ git commit -m "solved merge conflict between userinput and master branches"
And your tree looks like:
As you can see, resolving merge conflicts is a tedious process. Most projects try to avoid merge conflicts as much as possible. A simple strategy for doing this is using many small source files rather than a few large files. Of course, in most projects merge conflicts will be inevitable. That’s just the reality of working on large projects with many team members.
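If you want to practice on something smaller, the sketch below creates a conflict on purpose and then resolves it. It uses a throwaway repo, a placeholder identity, a shortened main.cpp, and a hypothetical branch name feature. Both branches change the same line, which is what forces the conflict:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Student"        # placeholder identity
git config user.email "student@example.com"
printf 'int main()\n{\n    return 1;\n}\n' > main.cpp
git add main.cpp
git commit -q -m "added the first code"
main_branch=$(git symbolic-ref --short HEAD)   # master in this lab
git branch feature                             # hypothetical branch name

printf 'int main()\n{\n    return 0;\n}\n' > main.cpp
git add main.cpp
git commit -q -m "return 0 on the main branch"

git checkout -q feature
printf 'int main()\n{\n    return 2;\n}\n' > main.cpp
git add main.cpp
git commit -q -m "return 2 on the feature branch"

git checkout -q "$main_branch"
git merge feature || echo "merge stopped with a conflict"
conflicted=$(git diff --name-only --diff-filter=U)   # files that still contain conflicts
echo "conflicted: $conflicted"

# Resolve by writing the version we want, then add and commit as usual.
printf 'int main()\n{\n    return 0;\n}\n' > main.cpp
git add main.cpp
git commit -q -m "solved merge conflict between feature and main branches"
remaining=$(git diff --name-only --diff-filter=U)
```

Before the resolution step, running cat main.cpp shows the same kind of <<<<<<< and >>>>>>> markers described above.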
exercise
Given the same repo above, draw the tree that results after running the following commands. You will have to submit this to the TA before the end of lab.
$ git branch -d userinput
$ git branch -d bugfix
$ echo "everything is awesome" > README
$ git add README
$ git commit -m "changed the README"
You should check the git cheatsheet to figure out what the git branch -d command does.
Enrollment and advanced tutorial
You should also go through the more advanced git exercises (linked as “Advanced”). They may help you manage your programs during this class.
When you are done with that, you will still need to enroll in the class (linked as “Enrollment”).