This repository contains all the course materials for MMG3320/5320 Advanced Bioinformatics course
This project is maintained by PRodriguez19
~ # home dir
. # current dir
.. # parent dir
* # wildcard
ctrl + c # cancel current command
ctrl + l # clear your terminal screen
cat # prints out all the contents of a file
less # allows you to view and move through file content
head # allows you to view beginning of file
tail # allows you to view end of file
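As a quick illustration of the viewing commands and the wildcard listed above, here is a minimal sketch. The file names (sample.txt, notes.txt, readme.md) are made up for the example, and everything happens in a temporary directory so none of your own files are touched.

```shell
# Create a throwaway directory with a few hypothetical files.
workdir=$(mktemp -d)
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 > "$workdir/sample.txt"
cp "$workdir/sample.txt" "$workdir/notes.txt"
touch "$workdir/readme.md"

# head and tail show the beginning and end of a file.
first_lines=$(head -n 3 "$workdir/sample.txt")
last_lines=$(tail -n 2 "$workdir/sample.txt")

# The * wildcard expands to every matching name; here, names ending in .txt.
txt_files=$(cd "$workdir" && echo *.txt)

echo "$first_lines"
echo "$last_lines"
echo "$txt_files"
```

Note that readme.md is not listed by the last command because it does not match the *.txt pattern.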
The wildcard *
Complicated names of files and directories can make your life painful when working on the command line. Here we provide a few useful tips for the names of your files and directories.
Don’t use spaces.
Spaces can make a name more meaningful, but since spaces are used to separate arguments on the command line it is better to avoid them in names of files and directories. You can use - or _ instead (e.g. fastq-data-files/ rather than fastq data files/). To test this out, try typing mkdir fastq data files and see what directory (or directories!) are made when you check with ls -F.
Don’t begin the name with - (dash).
Don’t begin the name with numbers.
Stick with letters in the beginning and then use numbers, . (period), - (dash), or an _ (underscore) in the middle of the file or directory name.
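The effect of spaces in a name can be seen in a short sketch. It runs inside a temporary directory (an assumption made for safety, not part of the lesson) and uses the same directory names as the tip above.

```shell
workdir=$(mktemp -d)

# Unquoted spaces: mkdir receives three separate arguments,
# so three directories are created instead of one.
(cd "$workdir" && mkdir fastq data files)
unquoted_count=$(ls "$workdir" | wc -l | tr -d ' ')

# A hyphenated name is a single argument, so one directory is created.
(cd "$workdir" && mkdir fastq-data-files)

# -F appends / to each directory name in the listing.
ls -F "$workdir"
echo "directories made by the unquoted command: $unquoted_count"
```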
You may have noticed by now that all the files we are using are named ‘something dot something’.
Class Activity
Before moving on, please complete the class activity below. You will have ~5 minutes to answer all questions except the final one!
We’ve been able to do a lot of work with files that already exist, but what if we want to create our own files?
In order to create or edit files we will need to use a text editor. When we say, “text editor,” we really do mean “text”. These editors can only work with plain character data, not tables, images, or any other media. Text editors can generally be grouped into two categories: command-line editors and graphical user interface editors.
Some popular editors include:
These are editors which are generally available for use on high-performance compute clusters. There are also simpler editors available for use on the cluster (e.g. Nano), but they tend to have limited functionality. We will discuss Nano and Vim in this lesson.
Nano is a simple text editor for UNIX/Linux operating systems. Nano is easy to use but has its limitations.
To create a new file or edit an existing one type:
nano filename
Type the following in your terminal:
nano colors.txt
After pressing the Enter key, the nano editor appears. Notice the following elements:
At this point we can begin typing:
red
blue
yellow
Notice that after your first keystroke, the word “Modified” appears in the upper-right corner. This shows that you have changed the contents of your file but it has not been saved yet.
Saving your work: To save your edited file to disk, press Ctrl-o. Nano displays the current filename. (To save the file under a different name, delete the filename that Nano displays and type a new one.) Press Enter.
Exiting Nano: To exit Nano, press Ctrl + x. If you made any changes since the last save, Nano will ask whether or not to save them. Type y for yes or n for no. Press Enter.
Summary: Basic nano commands
key | action |
---|---|
Ctrl + X | exit from the editor |
Ctrl + A | lets you jump to the beginning of the line |
Ctrl + E | lets you jump to the end of the line |
Ctrl + V | scroll page down |
Ctrl + Y | scroll page up |
Ctrl + O | save the file |
Ctrl + K | cuts the entire selected line |
Class activity #2 You will have ~5 minutes to complete
get-pip.py
from this location /gpfs1/cl/mmg3320/course_materials/tutorialsget-pip.py
Vim is another text editor, but it is much more powerful than Nano because it offers extensive text editing options. We will explore some of the differences.
How do I keep track of all these shortcuts in Vim?
To help you remember some of the keyboard shortcuts that are introduced and to allow you to explore additional functionality on your own, hbctraining has already compiled a cheatsheet linked here. Download it to your computer; it is a useful resource to have open while using Vim.
You can create a document by calling a text editor (in our case vim) and providing the name of the document you wish to create.
Change directories to unix_lesson/other and create a document called draft.txt using the vim command:
vim draft.txt
Notice the "draft.txt" [New File] typed at the bottom left-hand section of the screen. This tells you that you just created a new file in vim.
Vim has two basic modes that will allow you to create documents and edit your text:
command mode (default mode): will allow you to save and quit the program (and execute other more advanced commands).
insert (or edit) mode: will allow you to write and edit text
Upon creation of a file, vim is automatically in command mode. Let’s change to insert mode by typing i. Note the --INSERT-- at the bottom left-hand section of the screen.
Now type in a few lines of text:
After you have finished typing, press Esc to enter command mode. Note that the --INSERT-- has now disappeared from the bottom of the screen.
Review of Vim modes
key | action |
---|---|
i | insert mode - to write and edit text |
Esc | command mode - to issue commands / shortcuts |

To “write to file” or save the modifications made to the file, type :w when in command mode. You can see the commands you type in the bottom left-hand corner of the screen.
After you have saved the file, the total number of lines and characters in the file will print out at the bottom left-hand section of the screen.
Alternatively, we can write to file (save changes) and quit all at once by typing :wq. After typing :wq while in command mode, you will exit vim and be returned to your command prompt.
Review of saving and quitting
key (in command mode) | action |
---|---|
:w | to write to file (save) |
:wq | to write to file and quit |
:q! | to quit without saving |
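The line and character counts that vim prints after a save can be cross-checked from the shell with wc. The sketch below is an illustration only: it writes a stand-in file with printf rather than saving from vim, and the contents are hypothetical.

```shell
workdir=$(mktemp -d)

# Stand-in for a file saved from vim (hypothetical contents).
printf 'red\nblue\nyellow\n' > "$workdir/draft.txt"

# wc -l counts lines; wc -c counts bytes, which equals the number of
# characters for plain ASCII text. tr strips padding some systems add.
line_count=$(wc -l < "$workdir/draft.txt" | tr -d ' ')
char_count=$(wc -c < "$workdir/draft.txt" | tr -d ' ')

echo "$line_count lines, $char_count characters"   # prints "3 lines, 16 characters"
```

The character count includes the newline at the end of each line, which is why three short words add up to 16.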
Class activity #3 You will have ~5 minutes to complete
spider.txt using vim.
Once you have finished typing, you can display line numbers by changing to command mode and then typing the :set number command. Later, if you choose to remove the line numbers you can reset it with :set nonumber.
key (in command mode) | action |
---|---|
:set number | to number lines |
:set nonumber | to remove line numbers |

Save the document using :w.
Now while in command mode, try moving around the file spider.txt and familiarizing yourself with some of these shortcuts!
Navigating around the file
key (in command mode) | action |
---|---|
gg | to move to top of file |
G | to move to bottom of file |
$ | to move to end of line |
0 | to move to beginning of line |
w | to move to next word |
b | to move to previous word |
Practice some of the editing shortcuts, then quit the document and remember to save changes.
Editing the file
key (in command mode) | action |
---|---|
dw | to delete word |
dd | to delete line |
u | to undo |
Ctrl + r | to redo |
/pattern | to search for a pattern (n/N to move to next/previous match) |
:%s/search/replace/g | to search for a pattern and replace for all occurrences |
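Vim's search-and-replace command, :%s/search/replace/g, shares its s/pattern/replacement/g syntax with the command-line tool sed. The sketch below uses sed rather than vim itself, purely to show what the substitution does; the file contents are hypothetical and are not the text of the class activity.

```shell
workdir=$(mktemp -d)

# Hypothetical two-line file for the demonstration.
printf 'the spider climbed up the water spout\ndown came the rain\n' > "$workdir/spider.txt"

# s/the/THE/g replaces every occurrence of "the" on each line.
# sed works on every line by default, which is what % requests in vim.
replaced=$(sed 's/the/THE/g' "$workdir/spider.txt")
echo "$replaced"

# grep -c counts matching lines, a rough shell analogue of searching with /pattern.
rain_lines=$(grep -c 'rain' "$workdir/spider.txt")
echo "lines containing rain: $rain_lines"
```

Without the trailing g, only the first match on each line would be replaced, in both sed and vim.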
Class Exercise #4
spider.txt
and delete the word “water” from line #2. Note, you will need to be at the first letter of the word to delete the entire word!

A GUI is an interface that has buttons and menus that you can click on to issue commands to the computer, and you can move about the interface just by pointing and clicking. These include BBEdit and Visual Studio Code, which allow you to write and edit plain text documents. These editors often have features to easily search text, extract text, and highlight syntax from multiple programming languages. They are great tools, and indeed you should download one to use to create your own scripts in the future!

Open a new Microsoft Word Document and submit two screenshots (Part A, B, and C). The first four lines of your document should contain the following:
Part A: Class Exercise #4 output.
Part B: Generating your own script. You got the following lines of code from a trusted source but need to modify them so you can submit them to the VACC-Bluemoon server. You decide it's time to make your own script. Follow the steps below:
script.sh
in a text editor of your choice.
Paste the code below into script.sh.
STAR --runThreadN 4 \
--runMode genomeGenerate \
--genomeDir /username/chr1_hg19_STAR_index/ \
--genomeFastaFiles /username/reference_data_ensembl/Homo_sapiens.GRCh19.dna.chromosome.1.fa \
--sjdbGTFfile /username/reference_data_ensembl/Homo_sapiens.GRCh19.gtf
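Each line of the command above, except the last, ends with a backslash. A minimal sketch with echo (chosen only because it is harmless to run) shows that a command split this way behaves exactly like the same command written on one line.

```shell
# One logical command split across three lines with trailing backslashes.
joined=$(echo one \
  two \
  three)

# The same command written on a single line.
single=$(echo one two three)

echo "$joined"   # prints "one two three"
```

The backslash must be the very last character on the line; a space after it would stop the shell from treating the next line as a continuation.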
Please Take Note:
\ . This is an escape character that changes how the character following it is interpreted. In this case it escapes the end of the line, so the command continues on the next line.

Part C: Using “vim” one more time
update_hg19_chromosomes.txt
from this location /gpfs1/cl/mmg3320/course_materials/tutorials

This lesson has been developed by members of the teaching team at the Harvard Chan Bioinformatics Core (HBC). These are open access materials distributed under the terms of the Creative Commons Attribution license (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.