If you are a developer, you are probably acquainted with Git and may already
use it to some extent in your daily activities. If you are an IT professional
who spends most of your time working on infrastructure issues, you may not
have had a chance to use it yet.
Cloud and DevOps are among the hottest topics in our industry today, so
understanding how Git works will help us understand in more detail how it is
being used in those realms.
More and more IT pros are moving into uncharted waters such as
infrastructure as code (IaC), scripting, automation, and so forth. It is
common to see an IT pro, like the humble author of this article, using Visual
Studio to deploy Azure infrastructure, keeping scripts under version control,
and working in many similar scenarios.
To top it all off, Microsoft recently acquired GitHub, a software-as-a-service
(SaaS) solution that integrates with Git to keep repositories synchronized and
consistent and to help teams collaborate and develop code faster than ever
before.
What is Git, by the way? Git is a lightweight version-control tool that was
created to support the maintenance of the Linux kernel. It was created by
Linus Torvalds (yep, the guy who created Linux) and has a vast number of
contributors. Since its inception in 2005, it has been used in development
projects of all shapes and sizes.
You may be saying, “I’m an IT pro! There is nothing for me here, right?”
Well, Git helps virtually any professional who needs to implement version
control and wants to stay organized. It is not just for developers; you may
even want to use it for your documentation files.
So, the short answer is that there are a lot of things you can take advantage
of by using Git, and it does not hurt to understand a pretty cool technology,
does it?
In this article, we will focus on the basic concepts needed to understand
Git. In our next article, we will use Git with some PowerShell scripts that we
are developing to demonstrate scenarios where Git can be useful even for local
version control. Later on, we will finish up with an article here at It
Jankari covering the integration between Git and GitHub.
The basics
First and most important, Git keeps a local repository (a database) to track
changes, and it does so per directory. Every time we initialize Git in a
folder, a hidden .git folder structure is created there to support the
versioning of that folder and its subfolders (we can see the structure in the
image below). Everything in Git is identified by a SHA-1 hash, written as 40
hexadecimal characters and computed from the content being stored, and we will
see those hashes all over the place.
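To make the hashing idea concrete, here is a minimal Python sketch of how Git derives the ID of a file's content: it hashes a short header (the word blob, the content length, and a NUL byte) followed by the bytes themselves. The function name git_blob_hash and the sample script lines are ours, for illustration only; for the same bytes the result should match what the git hash-object command prints.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Return the 40-character SHA-1 ID Git gives a file with this content."""
    # Git hashes a small header plus the content, not the content alone.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # Two hypothetical one-line scripts that differ by a single word.
    first = git_blob_hash(b"Get-Service | Where-Object Status -eq 'Running'\n")
    second = git_blob_hash(b"Get-Service | Where-Object Status -eq 'Stopped'\n")
    print(first, len(first))   # a hash of 40 hexadecimal characters
    print(first == second)     # False: different content, different hash
    print(git_blob_hash(b""))  # the ID of an empty file
```

Because the hash depends only on the content, two files with identical bytes get the same ID no matter what they are called or where they live.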
The second important point is to understand the state of any given file. A
file can be in one of three states:
· Committed: The data is saved in the local database. Git has it covered, so
don’t worry about this one.
· Modified: The file was modified, but the change is neither in the local
database nor staged. There is a change in the file, and we have to take
action if we want to keep it in our version control.
· Staged: The modified file has been marked to be part of your next commit.
If the file is staged, the snapshot is already there, just waiting for the
commit process.
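The three states above can be sketched as a comparison of checksums across three places: the file in the folder, the staging area, and the last commit. This is a simplified model, not how Git is implemented internally; the names checksum and file_states are hypothetical, and a plain SHA-1 of the bytes stands in for Git's own object ID.

```python
import hashlib


def checksum(content: bytes) -> str:
    """Simplified stand-in for the ID Git computes for a file's content."""
    return hashlib.sha1(content).hexdigest()


def file_states(working: bytes, staged: str, committed: str) -> list:
    """Classify one file by comparing checksums across the three areas.

    working   -- current bytes of the file in the folder
    staged    -- checksum of the snapshot in the staging area
    committed -- checksum of the snapshot in the last commit
    """
    states = []
    if checksum(working) != staged:
        states.append("modified")   # edited since it was last staged
    if staged != committed:
        states.append("staged")     # snapshot waiting for the next commit
    return states or ["committed"]  # nothing differs: Git has it covered


v1 = b"Write-Host 'v1'\n"
v2 = b"Write-Host 'v2'\n"
print(file_states(v1, checksum(v1), checksum(v1)))  # ['committed']
print(file_states(v2, checksum(v1), checksum(v1)))  # ['modified']
print(file_states(v2, checksum(v2), checksum(v1)))  # ['staged']
```

The function returns a list because a file that is staged and then edited again is both staged and modified at the same time.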
It is important to understand that the information saved in the local
database is stored as snapshots of the files. If a file wasn’t changed between
commits, it is not copied over again; the new snapshot just keeps a reference
to the unaltered file. Git decides whether a file was modified by comparing
checksums.
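A small sketch may help show why unchanged files cost nothing extra. It assumes a toy object store, a Python dictionary keyed by checksum, and a hypothetical snapshot function; the file names and contents are made up for the example.

```python
import hashlib

objects = {}  # checksum -> file content, each distinct content stored once


def snapshot(files: dict) -> dict:
    """Record a snapshot: map each file name to the checksum of its content."""
    refs = {}
    for name, content in files.items():
        key = hashlib.sha1(content).hexdigest()
        objects.setdefault(key, content)  # unchanged content is not stored again
        refs[name] = key
    return refs


# Between the two snapshots only deploy.ps1 changes.
first = snapshot({"deploy.ps1": b"v1\n", "readme.txt": b"docs\n"})
second = snapshot({"deploy.ps1": b"v2\n", "readme.txt": b"docs\n"})

print(first["readme.txt"] == second["readme.txt"])  # True: same reference reused
print(len(objects))  # 3 stored contents for 4 file entries
```

Both snapshots are complete pictures of the folder, yet the unchanged readme.txt is stored only once and simply referenced twice.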
When a commit operation is executed, Git stores three kinds of objects: a
commit object for that specific operation, which contains a reference to a
tree (and to any previous commits, which are referred to as parents); a tree
object, which contains links to the blobs of every file that is part of the
current commit; and one blob for each file that is part of the commit, reusing
the existing blob when a file has not changed. Every one of these objects is
identified by its checksum.
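The relationship between commit, tree, and blobs can be sketched in a few lines of Python. This is a simplified illustration that assumes a flat folder of regular files with ASCII names; the helper names, the file contents, and the Demo User identity are hypothetical.

```python
import hashlib


def git_object_id(kind: str, body: bytes) -> str:
    """Hash an object as Git does: kind, a space, the size, a NUL byte, the body."""
    header = kind.encode("ascii") + b" " + str(len(body)).encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(blob_ids: dict) -> bytes:
    """List every file as: mode, a space, name, a NUL byte, the 20-byte blob ID."""
    body = b""
    for name in sorted(blob_ids):  # entries are kept in name order
        body += b"100644 " + name.encode("utf-8") + b"\0"
        body += bytes.fromhex(blob_ids[name])
    return body


def commit_body(tree_id: str, parents: list, who: str, message: str) -> bytes:
    """A commit names its tree, its parent commits (if any), and who made it."""
    lines = ["tree " + tree_id]
    lines += ["parent " + parent for parent in parents]
    lines += ["author " + who, "committer " + who, "", message]
    return "\n".join(lines).encode("utf-8")


files = {"deploy.ps1": b"Write-Host 'deploying'\n", "readme.txt": b"docs\n"}
blob_ids = {name: git_object_id("blob", data) for name, data in files.items()}
tree_id = git_object_id("tree", tree_body(blob_ids))

who = "Demo User <demo@example.com> 1700000000 +0000"  # hypothetical identity
first = git_object_id("commit", commit_body(tree_id, [], who, "First commit\n"))
second = git_object_id("commit", commit_body(tree_id, [first], who, "Second commit\n"))

print("tree  ", tree_id)
print("first ", first)   # no parent: the very first commit
print("second", second)  # records the first commit as its parent
```

Changing a single byte in any file changes that file's blob ID, which changes the tree ID, which in turn changes the commit ID. That chain is what lets Git detect any alteration through checksums alone.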
A third important item in the Git universe is the branch feature. Using
branches, we can keep a main line of development and allow several lines of
development to diverge from it. Those lines can be used to test and validate
new features, fix issues, and so forth.
A branch is just a pointer to a commit, and we can switch between branches as
we wish. Switching changes the content and files of the folder that is being
controlled by Git. By default, any new Git repository has a master branch,
which is where all the mainstream changes occur, but we can create other
branches to tackle different areas of development and merge them back into
master later on.
Git has a special pointer called HEAD, which tells us which branch we are
currently on.
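Since branches and HEAD are only pointers, they can be modeled with a dictionary and a string. The sketch below is a toy model under that assumption; the Repo class and the commit IDs c1 and c2 are hypothetical placeholders, not real hashes, and the model ignores the file changes that a real switch makes in the folder.

```python
class Repo:
    """A toy model of branches: each branch name points to one commit ID."""

    def __init__(self):
        self.branches = {"master": "c1"}  # placeholder commit IDs
        self.head = "master"              # HEAD names the branch we are on

    def commit(self, new_id):
        # A commit moves only the branch that HEAD currently points to.
        self.branches[self.head] = new_id

    def branch(self, name):
        # Creating a branch adds a pointer to the current commit; nothing is copied.
        self.branches[name] = self.branches[self.head]

    def switch(self, name):
        # Switching branches re-aims HEAD at another branch.
        self.head = name


repo = Repo()
repo.branch("feature")
repo.switch("feature")
repo.commit("c2")
print(repo.branches)  # {'master': 'c1', 'feature': 'c2'}
print(repo.head)      # feature
```

The new commit moved the feature pointer while master stayed where it was, which is exactly how a line of development diverges from the main one.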
How do we work with Git?
These are a few of the key principles of Git, and I do understand that you
haven’t yet touched the command prompt to test them out. Before going there,
we need to understand the process that a regular user follows locally.
1. You work on files and make changes.
2. You stage the files manually using the Git command line.
3. You commit the staged files into the local database.
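For readers who like to see the three steps above driven from a script, here is a hedged Python sketch that runs the real git commands (init, add, commit, and status --porcelain) in a throwaway folder. It assumes Git is installed and on the PATH; the helper names, the notes.txt file, and the Demo User identity are hypothetical.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def git(repo: Path, *args: str) -> str:
    """Run one git command inside the repo folder and return its output."""
    result = subprocess.run(
        ["git", *args], cwd=repo, check=True, capture_output=True, text=True
    )
    return result.stdout


def demo_workflow():
    """Edit, stage, and commit one file; return the status seen after each step."""
    if shutil.which("git") is None:
        return None  # Git is not installed, so there is nothing to demonstrate
    statuses = []
    with tempfile.TemporaryDirectory() as folder:
        repo = Path(folder)
        git(repo, "init")
        # 1. Work on a file.
        (repo / "notes.txt").write_text("first draft\n")
        statuses.append(git(repo, "status", "--porcelain").strip())
        # 2. Stage the file.
        git(repo, "add", "notes.txt")
        statuses.append(git(repo, "status", "--porcelain").strip())
        # 3. Commit the staged file into the local database.
        git(repo, "-c", "user.name=Demo User", "-c", "user.email=demo@example.com",
            "-c", "commit.gpgsign=false", "commit", "-m", "Add notes")
        statuses.append(git(repo, "status", "--porcelain").strip())
    return statuses


if __name__ == "__main__":
    print(demo_workflow())  # ['?? notes.txt', 'A  notes.txt', '']
```

In the short status format, ?? marks a file Git is not tracking yet, A marks a file added to the staging area, and an empty status means everything in the folder is committed.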
These are the basic steps for working with Git. Once you understand them, it
is easier to move on to more advanced uses such as creating different
branches, merging them afterward, integrating with GitHub, and so forth.
I know that we haven’t had a chance
to have a lot of action in this article but stay tuned. Our next article will
involve several scenarios using Git to help our productivity and organization
of scripts. But keep in mind that it can be used for a variety of things, not
just scripts.