
Tax4Fun2 Analysis Tutorial in Rstudio [16S Functional Prediction]

Cha-Nyong 2022. 11. 28. 20:34

Hello,

 

This post is about functional prediction analysis of microbiomes from 16S rRNA data.

 

There are three main tools for functional prediction of bacterial microbiomes:

PICRUSt, FAPROTAX, and Tax4Fun2.

 

All of these have weak points and strong points.

In my case, I prefer to use Tax4Fun2 because of its analysis pipeline.

 

Briefly, the 16S rRNA sequences are BLASTed against a reference database,

and the most similar reference genome is selected for each sequence.

These genomes are annotated with KEGG.

Finally, we get pathway predictions at levels 1 to 3, and even functions at the enzyme level.

According to the related SCI paper, the accuracy is pretty good.

 

 

 

Now, let's do the analysis with the following steps.

 

 

 

 

 

 

 

 

 

1. Install packages

 

The package name is "Tax4Fun2".

When we install packages in RStudio, we usually use the command below.

 

install.packages("packagename")

However, we cannot install Tax4Fun2 with the above command.

We have to install it manually from the package source file.

 

 

 

 

 

 

1.1 Download the package source file

 

The package source file is attached here (named "Tax4Fun2_1.1.5.tar.gz").

Or you can download it from the GitHub page below.

(https://github.com/ZihaoShu/Tax4Fun2/blob/main/Tax4Fun2_1.1.5.tar.gz)

 

 

 

 

 

 

 

1.2 Put the package source file in the R library path

Now, move Tax4Fun2_1.1.5.tar.gz to the library path.

You can check where your library path is with the command below.

 

.libPaths()

 

 

 

 

 

1.3 Install the package from the source file

 

We use a slightly different install command this time, shown below.

install.packages("C:/Program Files/R/R-4.2.2/library/Tax4Fun2_1.1.5.tar.gz", repos = NULL, type="source")

My library directory was C:/Program Files/R/R-4.2.2/library, so the command installs the Tax4Fun2_1.1.5.tar.gz source file from there.

repos = NULL tells R to install from a local file instead of a repository, and type = "source" tells R that the file is a source package.
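As a small sketch of the same step, the path can be built from .libPaths() instead of being typed by hand. This assumes the .tar.gz file was copied into the first library folder, as in step 1.2. The variable name pkg_file is only for illustration.

```r
# Assumption: the tarball was copied into the first library folder (step 1.2).
pkg_file <- file.path(.libPaths()[1], "Tax4Fun2_1.1.5.tar.gz")

if (file.exists(pkg_file)) {
  # repos = NULL: install from a local file; type = "source": it is a source package
  install.packages(pkg_file, repos = NULL, type = "source")
} else {
  message("File not found: ", pkg_file)
}
```

If the message appears, check the folder printed by .libPaths() and the file name.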

 

 

 

 

 

 

 

 

 

 

 

 

 

2. Load the package

library(Tax4Fun2)

If you do not get any error message, the package was installed correctly.

If you get errors, you should look for another installation method.
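If you prefer a check that does not stop with an error, one possible sketch uses requireNamespace() from base R. The variable name installed_ok is only for illustration.

```r
# requireNamespace() returns TRUE if the package can be loaded, FALSE otherwise.
installed_ok <- requireNamespace("Tax4Fun2", quietly = TRUE)

if (installed_ok) {
  message("Tax4Fun2 is installed.")
} else {
  message("Tax4Fun2 was not found. Check the install step again.")
}
```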

 

 

 

 

 

 

 

 

 

 

 

 

3. Install Reference Data

buildReferenceData("path_to_work_directory", use_force = TRUE, install_suggested_packages = TRUE)

If you get an error message saying "Timeout of 60 seconds was reached",

run the commands below and try again.

 

getOption('timeout')

#[1] 60

options(timeout = 9999)

getOption('timeout')

#[1] 9999

 

buildReferenceData("path_to_work_directory", use_force = TRUE, install_suggested_packages = TRUE)


When it finishes, you will see a completion message. Have fun with Tax4Fun2!
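If you want the timeout to go back to its old value after the download, options() returns the previous settings, so they can be saved and restored. This is only a sketch. The variable name old is for illustration, and the buildReferenceData() call is left as a comment so nothing is downloaded by accident.

```r
old <- options(timeout = 9999)   # set a long timeout and keep the previous value
getOption("timeout")             # now 9999

# buildReferenceData("path_to_work_directory", use_force = TRUE, install_suggested_packages = TRUE)

options(old)                     # restore the previous timeout
```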

 

 

 

 

 

 

 

 

 

 

 

 

 

 

4. Install Blast Program

 

4.1 Set Working directory

 

First, set the working directory to the folder where Tax4Fun2_ReferenceData_v2 was installed.

getwd() #Now working directory

setwd("path_to_work_directory") # Set working directory

getwd() #Confirm working directory

 

buildDependencies("Tax4Fun2_ReferenceData_v2", install_suggested_packages = TRUE) #Install

After this command finishes, you will find a "blast_bin" folder.

 

Now installation is complete.
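To confirm that the folder is really there, a quick check with dir.exists() may help. Where exactly "blast_bin" ends up is an assumption in this sketch, so it looks both in the working directory and inside the reference data folder. The variable names are only for illustration.

```r
# Assumption: the working directory is the one that contains Tax4Fun2_ReferenceData_v2.
ref_dir    <- "Tax4Fun2_ReferenceData_v2"
candidates <- c("blast_bin", file.path(ref_dir, "blast_bin"))  # two possible locations

found <- dir.exists(candidates)   # TRUE/FALSE for each candidate folder
if (any(found)) {
  message("Found: ", paste(candidates[found], collapse = ", "))
} else {
  message("blast_bin was not found. Check the working directory with getwd().")
}
```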

 

 

 

 

 

 

 

 

 

 

 

 

 

5. Install the Excel packages

Unfortunately, there is one more package left to install.

Don't worry. It's simple.

 

 

 

5.1 Install pacman

install.packages("pacman")

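"pacman" is a package manager. Its p_load() function loads several packages at once and installs any that are missing, so we use it to load the packages for Excel and CSV files.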

 

Now the installation is complete. I promise.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

6. Formatting the data frame

 

6.1 CSV to FASTA

pacman::p_load(tidyverse, openxlsx, installr, tcltk, xlsx, readxl) #load packages

setwd("path/to/working/directory") #set working directory

csv = read.csv("path/to/seq/file/SRR_seq.csv", header = TRUE) # input sequence data (columns: chr = sequence ID, seq = sequence)

fa = character(2 * nrow(csv)) # empty vector with two lines per sequence

fa[c(TRUE, FALSE)] = sprintf(">%s", csv$chr) # odd lines: FASTA header (">" + ID)

fa[c(FALSE, TRUE)] = csv$seq # even lines: the sequence itself

The data format is shown in the file attached to this post, named "Prevotella_seq".

First, convert your OTU or ASV table to the sequence format of the attached file.

Second, run the commands above to convert the .csv to FASTA format.
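To see what the conversion does, here is a tiny self-contained example. The table toy and its IDs and sequences are made up for illustration. Only the column names chr and seq follow the commands above.

```r
# Toy table with the two columns the conversion expects: chr (ID) and seq (sequence).
toy <- data.frame(
  chr = c("ASV1", "ASV2"),
  seq = c("ACGTACGT", "TTGGCCAA"),
  stringsAsFactors = FALSE
)

fa_toy <- character(2 * nrow(toy))                 # two lines per sequence
fa_toy[c(TRUE, FALSE)] <- sprintf(">%s", toy$chr)  # positions 1, 3: headers
fa_toy[c(FALSE, TRUE)] <- toy$seq                  # positions 2, 4: sequences

# fa_toy now holds, in order: ">ASV1", "ACGTACGT", ">ASV2", "TTGGCCAA"
fa_toy
```

The logical index c(TRUE, FALSE) is recycled along the vector, so it selects every odd position, and c(FALSE, TRUE) selects every even position.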

  

 

 

 

 

 

6.2 Create fasta file

writeLines(fa, "SRR_seq.fna")

The .fna file will be created in your working directory.
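A quick way to check a written FASTA file is to read it back and count the header lines. The sketch below uses a temporary file and made-up records, so nothing in your working directory is touched. For your own data, read "SRR_seq.fna" instead.

```r
fa_check <- c(">ASV1", "ACGTACGT", ">ASV2", "TTGGCCAA")   # made-up example records
tmp_file <- tempfile(fileext = ".fna")
writeLines(fa_check, tmp_file)

lines_back <- readLines(tmp_file)
n_headers  <- sum(startsWith(lines_back, ">"))   # one header line per sequence

# TRUE when every header line is paired with exactly one sequence line
n_headers == length(lines_back) / 2
```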

 

 

 

 

 

 

 

 

 

 

 

7. Load Tax4Fun2

library(Tax4Fun2)

 

 

 

 

 

 

 

 

 

 

8. Functional analysis

 

8.1 Set Working directory

setwd("path/to/working/directory")

 

 

8.2 Functional Prediction Command

# 1. Run the reference blast

runRefBlast(path_to_otus = "SRR_seq.fna", path_to_reference_data = "F:/Analysis/2203_Feline/Tax4Fun2_ReferenceData_v2", path_to_temp_folder = "SRR", database_mode = "Ref99NR", use_force = T, num_threads = 6)

 

# 2. Predicting functional profiles **

makeFunctionalPrediction(path_to_otu_table = "SRR_Tax4Fun2_table.txt", path_to_reference_data = "F:/Analysis/2203_Feline/Tax4Fun2_ReferenceData_v2", path_to_temp_folder = "SRR", database_mode = "Ref99NR", normalize_by_copy_number = TRUE, min_identity_to_reference = 0.97, normalize_pathways = FALSE)

 

# 3. Calculating FRIs **

calculateFunctionalRedundancy(path_to_otu_table = "SRR_Tax4Fun2_table.txt", path_to_reference_data = "F:/Analysis/2203_Feline/Tax4Fun2_ReferenceData_v2", path_to_temp_folder = "SRR", database_mode = "Ref99NR", min_identity_to_reference = 0.97)

Here, I will explain which parts of the commands you should change.
 
#1 runRefBlast
path_to_otus = "File name.fna"
path_to_reference_data = "directory where reference data exist/Tax4Fun2_ReferenceData_v2"
path_to_temp_folder = "New folder name"
 
#2 makeFunctionalPrediction
path_to_otu_table = "table name.txt" # The data format is shown in the attached "Prevotella_table.txt". You can open it in Excel.
path_to_reference_data = "directory where reference data exist/Tax4Fun2_ReferenceData_v2"
path_to_temp_folder = "New folder name"
 
#3 calculateFunctionalRedundancy
path_to_otu_table = "table name.txt" # The data format is shown in the attached "Prevotella_table.txt". You can open it in Excel.
path_to_reference_data = "directory where reference data exist/Tax4Fun2_ReferenceData_v2"
path_to_temp_folder = "New folder name"
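Because the same paths appear in all three commands, one option is to store them in variables first and change them in one place. This is only a sketch. The variable names (ref_data, otu_fasta, otu_table, tmp_dir) are made up for illustration, the values are placeholders to replace with your own, and the calls only run if Tax4Fun2 is installed and both input files exist.

```r
# Placeholders: replace these four values with your own.
ref_data  <- "directory where reference data exist/Tax4Fun2_ReferenceData_v2"
otu_fasta <- "File name.fna"
otu_table <- "table name.txt"
tmp_dir   <- "New folder name"

if (requireNamespace("Tax4Fun2", quietly = TRUE) &&
    file.exists(otu_fasta) && file.exists(otu_table)) {
  library(Tax4Fun2)

  # 1. Run the reference blast
  runRefBlast(path_to_otus = otu_fasta, path_to_reference_data = ref_data,
              path_to_temp_folder = tmp_dir, database_mode = "Ref99NR",
              use_force = TRUE, num_threads = 6)

  # 2. Predict functional profiles
  makeFunctionalPrediction(path_to_otu_table = otu_table, path_to_reference_data = ref_data,
                           path_to_temp_folder = tmp_dir, database_mode = "Ref99NR",
                           normalize_by_copy_number = TRUE, min_identity_to_reference = 0.97,
                           normalize_pathways = FALSE)

  # 3. Calculate FRIs
  calculateFunctionalRedundancy(path_to_otu_table = otu_table, path_to_reference_data = ref_data,
                                path_to_temp_folder = tmp_dir, database_mode = "Ref99NR",
                                min_identity_to_reference = 0.97)
} else {
  message("Tax4Fun2 or the input files were not found. Nothing was run.")
}
```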
 
 
 
 
 
 
 
The END.