Formatting Rbec output
Contains both results from hambiNMR experiment and also a test of using “Just A Plate” cleanup method during library preparation.
Setup
Libraries and global variables
Global vars
data_raw <- here::here("_data_raw", "20240513_BTK_illumina_v3")
data <- here::here("data", "20240513_BTK_illumina_v3")
amplicontar <- here::here(data_raw, "rbec_output.tar.gz")
# make processed data directory if it doesn't exist
fs::dir_create(data)
# create temporary location to decompress
tmpdir <- fs::file_temp()
Read data
Untar Rbec output tarball
Read and format Rbec tables
Read raw counts tables
Read normalized counts tables
Renormalizing to 16S copy number
rf1 <- straintabs %>%
group_by(sample) %>%
mutate(tot = sum(count)) %>%
left_join(normstraintabs) %>%
mutate(f = norm_count/sum(norm_count)) %>%
mutate(count_norm = round(tot*f)) %>%
dplyr::select(-count, -tot, -norm_count, -f) %>%
mutate(tot = sum(count_norm),
f = count_norm/sum(count_norm)) %>%
ungroup()
Joining with `by = join_by(sample, strainID)`
Read metadata
Rows: 456 Columns: 6
── Column specification ────────────────────────────────────────────────────────
Delimiter: "\t"
chr (3): sample, replicate, evolution
dbl (3): plate, day, amp_conc
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Final dataframe
Analysis
Negative controls
First, we’ll check whether the experimental and plate negative controls look good
full %>%
filter(str_detect(sample, "^Neg")) %>%
summarize(tot = sum(count_norm), .by = c("sample", "plate", "replicate"))
This looks great. In all cases very few reads are coming off the negative controls. This means it is safe to assume there was no contamination during the library preps. More importantly it looks like the main experiment was clean because we see no growth/contamination in the R2A media blanks.
Positive controls
full %>%
filter(str_detect(sample, "^Pos")) %>%
mutate(total_pos_controls = n_distinct(sample)) %>%
group_by(sample) %>%
mutate(n_sp_detected = n()) %>%
distinct(sample, n_sp_detected, total_pos_controls)
This is also good - we detect all 23 species in the positive controls on each plate
Export
Now we can write these files for later use