Homework 2 (L3)
Homework Assignment #2 (50 points)
For this assignment you will have until 5PM on Monday, January 27th to submit on Brightspace. Late assignments will NOT be accepted.
Directions for Students:
Open a new Microsoft Word Document and submit answers to the questions below. The first four lines of your document should contain the following:
- Your name
- MMG3320/5320
- Today's date
- Homework Assignment #2
Part A: Practice using Less
- This is a multi-part question:
  a. Navigate into the genomics_data folder.
  b. Use the less command to open up the file Encode-hesc-Nanog.bed.
  c. Use the shortcut to get to the end of the file.
  d. Search for the string chr11.
  e. Report two rows that start with chr11. Include the start and end position in your answer.
  Exit the less buffer.
- Print to screen the last 5 lines of the file Encode-hesc-Nanog.bed. Submit a screenshot of the output as your answer.
- How many commands have you typed after going through this exercise? Submit a screenshot of the output as your answer.
Part B: Generating your own script
You got the following lines of code from a trusted source but need to modify them so you can submit them to the VACC-Bluemoon server. You decide it's time to make your own script. Follow the steps below:
- Create a new file in the other directory called script.sh.
  - The .sh file extension typically indicates that a file is a shell script.
  - In Unix-like operating systems (such as Linux and macOS), shell scripts are plain text files containing a sequence of commands that can be executed by a shell.
- Paste the code below into script.sh.

    STAR --runThreadN 4 \
    --runMode genomeGenerate \
    --genomeDir /username/chr1_hg19_STAR_index/ \
    --genomeFastaFiles /username/reference_data_ensembl/Homo_sapiens.GRCh19.dna.chromosome.1.fa \
    --sjdbGTFfile /username/reference_data_ensembl/Homo_sapiens.GRCh19.gtf

- Replace every occurrence of "username" with your netid.
- Delete the line containing --runMode.
- Change the --runThreadN from 4 to 6.
- You would also like to use the newest genome assembly, human reference 38 (hg38/GRCh38). Change this as well in your script.
- Submit a screenshot of your script in the Nano buffer as homework Part B.
  Save the file and EXIT.
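As a side illustration of the point made in the first step, that a shell script is a plain text file of commands executed by a shell, here is a minimal sketch. The file names demo.sh and demo_noext are hypothetical, and the file is written with a here-document into a temporary directory (rather than with Nano) only so the example can run on its own. It is not part of the assignment and does not touch script.sh.

```shell
# Work in a throwaway directory so nothing in your own folders is touched.
demo_dir=$(mktemp -d)

# Write a two-line plain text file; demo.sh is a hypothetical name for this sketch.
cat > "$demo_dir/demo.sh" <<'EOF'
echo "first command"
echo "second command"
EOF

# Hand the file to a shell, which runs the commands from top to bottom.
demo_output=$(sh "$demo_dir/demo.sh")
echo "$demo_output"

# The extension is a convention for people, not something the shell checks:
# the same commands run from a copy that has no extension at all.
cp "$demo_dir/demo.sh" "$demo_dir/demo_noext"
noext_output=$(sh "$demo_dir/demo_noext")
echo "$noext_output"
```

Both runs print the same two lines, which is why the .sh extension is described above as something that "typically indicates" a shell script rather than something required.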
Please Take Note:
- The argument --genomeDir is pointing to an entire directory while --genomeFastaFiles is pointing to a specific file. This is really important as the program is looking for specific files or entire directories (with files in them!) to run successfully.
- Each line here except the last ends with a \. The \ is an escape character: it changes how the shell treats the character that follows it. In this case it escapes the line break, so the command continues on the next line.
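The two notes above can be tried out with a short sketch. The names example_index and example.fa are hypothetical stand-ins created in a temporary directory; they are not the paths used in the assignment. The sketch assumes a standard POSIX shell such as bash.

```shell
# Hypothetical stand-ins for an index directory and a reference file.
work_dir=$(mktemp -d)
mkdir "$work_dir/example_index"
touch "$work_dir/example.fa"

# -d is true only for a directory, -f only for a regular file.
if [ -d "$work_dir/example_index" ]; then dir_check="directory"; else dir_check="not a directory"; fi
if [ -f "$work_dir/example.fa" ]; then file_check="file"; else file_check="not a file"; fi
echo "$dir_check / $file_check"

# A trailing \ escapes the line break, so these three lines are read as one command.
continued=$(echo one \
    two \
    three)
echo "$continued"
```

The \ must be the very last character on the line. If a space follows it, the shell escapes the space instead of the line break, and the command ends at that line.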