Type: Package
Title: Compositional Statistical Framework for RNA Fractionation Analysis
Version: 1.0.0
Date: 2025-10-16
Description: A compositional statistical framework for absolute proportion estimation between fractions in RNA sequencing data. 'FracFixR' addresses the fundamental challenge in fractionated RNA-seq experiments where library preparation and sequencing depth obscure the original proportions of RNA fractions. It reconstructs original fraction proportions using non-negative linear regression, estimates the "lost" unrecoverable fraction, corrects individual transcript frequencies, and performs differential proportion testing between conditions. Supports any RNA fractionation protocol including polysome profiling, sub-cellular localization, and RNA-protein complex isolation.
License: CC BY 4.0
Encoding: UTF-8
LazyData: true
Depends: R (≥ 4.0.0)
Imports: future.apply (≥ 1.8.1), nnls (≥ 1.4), ggplot2 (≥ 3.3.5), dplyr (≥ 1.0.7), RColorBrewer (≥ 1.1-2), tidyr (≥ 1.1.3), matrixStats (≥ 0.60.0), aod (≥ 1.3.1), stats, utils, future, grDevices
Suggests: testthat (≥ 3.0.0), knitr, rmarkdown
RoxygenNote: 7.3.2
VignetteBuilder: knitr
NeedsCompilation: no
Packaged: 2025-10-16 02:59:56 UTC; acleynen
Author: Alice Cleynen
Maintainer: Alice Cleynen <alice.cleynen@cnrs.fr>
Repository: CRAN
Date/Publication: 2025-10-21 17:40:02 UTC
FracFixR: Fraction Correction Framework for RNA-seq Data
Description
A compositional statistical framework for absolute proportion estimation between fractions in RNA sequencing data. FracFixR addresses the fundamental challenge in fractionated RNA-seq experiments where library preparation and sequencing depth obscure the original proportions of RNA fractions.
Author(s)
Maintainer: Alice Cleynen alice.cleynen@cnrs.fr (ORCID)
Authors:
Agin Ravindran
Nikolay Shirokikh nikolay.shirokikh@uwa.edu.au (ORCID)
DiffPropTest: Statistical Testing for Differential Proportions
Description
Performs statistical testing to identify transcripts with significantly different proportions between conditions in specified fraction(s). Implements three test options: GLM (most powerful), Logit, and Wald.
Usage
DiffPropTest(NormObject, Conditions, Types, Test = c("GLM", "Logit", "Wald"))
Arguments
NormObject: Output from FracFixR() function
Conditions: Character vector of exactly 2 conditions to compare
Types: Character vector of fraction type(s) to analyze. Can be a single fraction or multiple fractions (which will be combined)
Test: Statistical test to use: "GLM", "Logit", or "Wald"
Details
GLM: Uses binomial generalized linear model (most statistically powerful)
Logit: Faster alternative using logit transformation
Wald: Beta-binomial Wald test for overdispersed count data
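As an illustration of the GLM option, the following is a minimal sketch of a binomial GLM for one transcript, written with base R only. The data frame, its column names (total, success) and the counts are hypothetical, and the code is not taken from the package; the internal implementation may differ.

```r
# Hypothetical data for a single transcript: 2 conditions x 3 replicates.
dat <- data.frame(
  Condition = factor(rep(c("Control", "Treatment"), each = 3)),
  total     = c(200, 180, 220, 210, 190, 205),  # counts in the Total sample
  success   = c(60, 50, 70, 120, 100, 115)      # counts attributed to the fraction
)

# Binomial GLM on (successes, failures) with Condition as the only predictor.
fit <- glm(cbind(success, total - success) ~ Condition,
           family = binomial(), data = dat)

# The Condition coefficient is the log odds ratio between the two conditions.
coefs <- summary(fit)$coefficients
log_odds_ratio <- coefs["ConditionTreatment", "Estimate"]
pval <- coefs["ConditionTreatment", "Pr(>|z|)"]
```

A positive log_odds_ratio here means the transcript makes up a larger share of the fraction in Treatment than in Control.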
Value
Data frame with columns:
transcript: transcript identifier
mean_success_cond1/2: mean proportions in each condition
mean_diff: difference in proportions
log2FC: log2 fold change
pval: p-value from statistical test
padj: FDR-adjusted p-value
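The relationship between pval and padj can be sketched with stats::p.adjust. The Benjamini-Hochberg method used below is an assumption (the text above only says "FDR-adjusted"), and the p-values are made up.

```r
# Hypothetical raw p-values for four transcripts.
pval <- c(0.01, 0.04, 0.03, 0.20)

# Benjamini-Hochberg adjustment (assumed here; the package may use another FDR method).
padj <- p.adjust(pval, method = "BH")

# Adjusted p-values are never smaller than the raw ones.
res <- data.frame(transcript = paste0("Gene", 1:4), pval = pval, padj = padj)
```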
Examples
data(example_counts)
data(example_annotation)
# Run FracFixR
results <- FracFixR(example_counts, example_annotation, parallel=FALSE)
# Run differential testing
diff_results <- DiffPropTest(results,
                             Conditions = c("Control", "Treatment"),
                             Types = "Heavy_Polysome",
                             Test = "GLM")
FracFixR: Main Function for Fraction Correction
Description
This is the core function that implements the FracFixR framework. It takes raw count data from total and fractionated samples and reconstructs the original fraction proportions through compositional modeling.
Usage
FracFixR(MatrixCounts, Annotation, parallel = TRUE)
Arguments
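MatrixCounts: A numeric matrix of raw transcript/gene counts with transcripts/genes as rows and samples as columns (see example_counts)
Annotation: A data.frame with required columns Sample, Condition, Type, and Replicate (see example_annotation)
parallel: A logical indicating whether to process the transcripts in parallel (default = TRUE)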
Details
The function works by:
Filtering transcripts based on presence in Total samples
For each condition and replicate, fitting NNLS regression
Estimating global fraction weights and individual transcript proportions
Calculating the "lost" unrecoverable fraction
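The first two steps above (filtering on the Total samples, then working per condition and replicate) can be sketched in base R as follows. The filtering rule (a transcript must have a non-zero count in every Total sample) is an assumption made for illustration, and the simulated counts and object names are hypothetical; the package's actual thresholds may differ.

```r
set.seed(7)

# Hypothetical design: 2 conditions x 2 replicates x 3 fraction types.
annotation <- expand.grid(
  Type = c("Total", "Light_Polysome", "Heavy_Polysome"),
  Replicate = c("Rep1", "Rep2"),
  Condition = c("Control", "Treatment"),
  stringsAsFactors = FALSE
)
annotation$Sample <- paste0("Sample", seq_len(nrow(annotation)))

# Simulated counts: 20 genes x 12 samples, with three genes undetected everywhere.
counts <- matrix(rpois(20 * nrow(annotation), lambda = 30), nrow = 20,
                 dimnames = list(paste0("Gene", 1:20), annotation$Sample))
counts[1:3, ] <- 0

# Assumed filter: keep transcripts detected in every Total sample.
total_samples <- annotation$Sample[annotation$Type == "Total"]
keep <- rowSums(counts[, total_samples, drop = FALSE] > 0) == length(total_samples)
filtered <- counts[keep, , drop = FALSE]

# Group sample names by condition and replicate; each group is fitted separately.
groups <- split(annotation$Sample,
                list(annotation$Condition, annotation$Replicate),
                drop = TRUE)
```

Each element of groups holds one Total sample and its fraction samples, which is the unit on which a per-replicate regression would be run.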
Value
A list containing:
OriginalData: filtered input count matrix
Annotation: input annotation data
Propestimates: matrix of proportion estimates
Coefficients: data.frame of regression coefficients
Fractions: data.frame of estimated fraction proportions
plots: list of diagnostic plots
References
Cleynen et al. FracFixR: A compositional statistical framework for absolute proportion estimation between fractions in RNA sequencing data.
Examples
# Load example data
data(example_counts)
data(example_annotation)
# Run FracFixR
results <- FracFixR(example_counts, example_annotation, parallel=FALSE)
# View fraction proportions
print(results$Fractions)
PlotComparison: Create Volcano Plot for Differential Results
Description
Generates a volcano plot showing transcripts with significant differential proportions between conditions.
Usage
PlotComparison(DiffPropResult, Conditions = NULL, Types = NULL, cutoff = NULL)
Arguments
DiffPropResult: Output from DiffPropTest() function
Conditions: Character vector of conditions being compared
Types: Character vector of fraction types analyzed
cutoff: Optional y-axis maximum for plot
Value
Volcano plot-type object
Examples
data(example_counts)
data(example_annotation)
# Run FracFixR
results <- FracFixR(example_counts, example_annotation, parallel=FALSE)
# Run differential testing
diff_results <- DiffPropTest(results,
                             Conditions = c("Control", "Treatment"),
                             Types = "Heavy_Polysome",
                             Test = "GLM")
# Create volcano plot
volcano <- PlotComparison(diff_results,
                          Conditions = c("Control", "Treatment"),
                          Types = "Heavy_Polysome")
PlotFractions: Visualize Fraction Proportions
Description
Creates a stacked bar plot showing the distribution of RNA across fractions for each replicate, including the "lost" fraction.
Usage
PlotFractions(FracFixed)
Arguments
FracFixed: Output from FracFixR() function
Value
ggplot2 object showing fraction proportions
Examples
data(example_counts)
data(example_annotation)
# Run FracFixR
results <- FracFixR(example_counts, example_annotation, parallel=FALSE)
# Create fraction plot
frac_plot <- PlotFractions(results)
# Save plot with ggsave("fractions.pdf", frac_plot, width = 10, height = 8)
ProcessReplicate: Core NNLS Regression for Individual Replicates
Description
This function implements the mathematical core of FracFixR: fitting a non-negative least squares (NNLS) regression to estimate fraction weights and correct individual transcript abundances.
Usage
ProcessReplicate(RepMat, transcriptlist)
Arguments
RepMat: Data frame with transcripts as rows, samples as columns. Must include a "Total" column representing the whole cell lysate
transcriptlist: Character vector of transcript IDs to use for regression. These should be informative transcripts in the appropriate abundance range
Details
Mathematical basis:
Total = \alpha_0 + \alpha_1 \times Fraction1 + \alpha_2 \times Fraction2 + ... + \epsilon
where \alpha_0 represents the "lost" fraction and the other \alpha_i are the fraction weights.
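The regression above can be sketched on simulated data as follows. This is an illustration under stated assumptions, not the package's code: the counts, the coefficients in alpha_true and the helper fit_nonneg are hypothetical. The helper calls nnls::nnls() when the nnls package is installed and otherwise falls back to box-constrained least squares via stats::optim, which is a stand-in and not NNLS proper.

```r
set.seed(1)
n_transcripts <- 200

# Hypothetical counts for two fractions of the same replicate.
fraction1 <- rpois(n_transcripts, lambda = 50)
fraction2 <- rpois(n_transcripts, lambda = 80)

# Hypothetical true coefficients: alpha_0 (intercept), alpha_1, alpha_2.
alpha_true <- c(10, 0.6, 1.5)
total <- alpha_true[1] + alpha_true[2] * fraction1 + alpha_true[3] * fraction2

# Design matrix: a column of ones for alpha_0, then one column per fraction.
A <- cbind(intercept = 1, fraction1, fraction2)

fit_nonneg <- function(A, b) {
  if (requireNamespace("nnls", quietly = TRUE)) {
    # Lawson-Hanson NNLS; the coefficient vector is returned in $x.
    return(nnls::nnls(A, b)$x)
  }
  # Fallback: least squares with lower bound 0 on every coefficient.
  sse <- function(x) sum((A %*% x - b)^2)
  grad <- function(x) as.vector(2 * crossprod(A, A %*% x - b))
  optim(rep(1, ncol(A)), sse, grad, method = "L-BFGS-B", lower = 0)$par
}

alpha_hat <- fit_nonneg(A, total)           # estimates of alpha_0, alpha_1, alpha_2
fitted_total <- as.vector(A %*% alpha_hat)  # reconstructed Total counts
```

The non-negativity constraint keeps every estimated weight interpretable as a contribution of a fraction to the Total, which an unconstrained fit would not guarantee.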
Value
List containing:
Propestimates: Proportion estimates for each transcript
Coefficients: NNLS regression coefficients (fraction weights)
Fractions: Normalized fraction proportions
plot: Diagnostic plot of fitted vs residuals
beta_binomial_wald: Beta-Binomial Wald Test
Description
Implements Wald test using beta-binomial distribution to account for overdispersion in count data. Useful when variance exceeds that expected under binomial distribution.
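To make the notion of overdispersion concrete, the simulation below compares counts drawn with a fixed success probability to counts whose success probability itself varies between samples (a beta-binomial). All parameter values are hypothetical and the code only illustrates why a plain binomial variance can be too small; it does not reproduce the test itself.

```r
set.seed(42)
n_trials  <- 100    # total count per sample
n_samples <- 5000   # number of simulated samples
a <- 2; b <- 8      # Beta shape parameters, mean a / (a + b) = 0.2

# Binomial: every sample shares the same success probability.
binom_counts <- rbinom(n_samples, size = n_trials, prob = a / (a + b))

# Beta-binomial: each sample draws its own probability from Beta(a, b).
p_varying <- rbeta(n_samples, a, b)
betabinom_counts <- rbinom(n_samples, size = n_trials, prob = p_varying)

# Ratio of empirical variances; values above 1 indicate overdispersion.
dispersion_ratio <- var(betabinom_counts) / var(binom_counts)
```

For these parameters the theoretical inflation factor is 1 + (n_trials - 1) / (a + b + 1) = 10, so dispersion_ratio should be far above 1.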
Usage
beta_binomial_wald(counts, successes, annotation)
Arguments
counts: Total count matrix
successes: Success count matrix (not proportions)
annotation: Sample metadata
Value
Data frame with test results for all transcripts
Example annotation data frame
Description
A data frame containing sample annotations for the example_counts matrix. Describes the experimental design with conditions, fraction types, and replicates.
Usage
example_annotation
Format
A data frame with 12 rows and 4 columns:
- Sample
Sample identifier matching column names in example_counts
- Condition
Experimental condition (Control or Treatment)
- Type
Fraction type (Total, Light_Polysome, or Heavy_Polysome)
- Replicate
Replicate identifier (Rep1 or Rep2)
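An annotation with the same four columns can be built for your own experiment along these lines. This is a sketch: the object name my_annotation is hypothetical, and the assignment of sample names to rows is arbitrary here, whereas in practice Sample must match the column names of your count matrix.

```r
# Full factorial design: 2 conditions x 3 fraction types x 2 replicates = 12 samples.
design <- expand.grid(
  Type = c("Total", "Light_Polysome", "Heavy_Polysome"),
  Replicate = c("Rep1", "Rep2"),
  Condition = c("Control", "Treatment"),
  stringsAsFactors = FALSE
)

# Assemble the annotation with the column names described above.
my_annotation <- data.frame(
  Sample = paste0("Sample", seq_len(nrow(design))),
  Condition = design$Condition,
  Type = design$Type,
  Replicate = design$Replicate,
  stringsAsFactors = FALSE
)

# Each Condition/Type combination should appear once per replicate.
design_table <- table(my_annotation$Condition, my_annotation$Type)
```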
Source
Simulated data generated for package examples
Examples
data(example_annotation)
head(example_annotation)
table(example_annotation$Condition, example_annotation$Type)
Example RNA-seq count matrix
Description
A matrix containing simulated RNA-seq counts for 100 genes across 12 samples. The data simulates a polysome profiling experiment with two conditions (Control and Treatment) and three fractions (Total, Light_Polysome, Heavy_Polysome).
Usage
example_counts
Format
A numeric matrix with 100 rows (genes) and 12 columns (samples):
- rows
Gene identifiers (Gene1 to Gene100)
- columns
Sample identifiers (Sample1 to Sample12)
Source
Simulated data generated for package examples
Examples
data(example_counts)
dim(example_counts)
head(example_counts[, 1:6])
extract_condition_matrix: Extract and Prepare Data for Statistical Testing
Description
Extracts count and proportion data for specified conditions and fraction types. Handles both single fraction and combined fraction analyses.
Usage
extract_condition_matrix(
  originalcounts,
  proportions,
  annotation,
  conditions,
  types
)
Arguments
originalcounts: Original count matrix
proportions: Proportion estimates from FracFixR
annotation: Sample annotation data frame
conditions: Vector of conditions to extract
types: Vector of fraction types to analyze
Value
List containing:
counts: Total counts from whole cell samples
successes: Proportion data for specified fractions
annotation: Filtered and processed annotation
logit_diff_test: Logit-based Differential Test
Description
Alternative to GLM using logit transformation. Faster but potentially less powerful than the full GLM approach.
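The general idea of a logit-based comparison can be sketched for one transcript as follows. The proportions are made up, and the choice of a Welch t-test on the logit scale is an assumption for illustration; the function's actual test statistic may differ.

```r
# Hypothetical proportions of one transcript in a fraction, 3 replicates per condition.
prop_control   <- c(0.30, 0.28, 0.33)
prop_treatment <- c(0.55, 0.52, 0.58)

# Clamp away from 0 and 1 so that the logit stays finite.
eps <- 1e-6
clamp <- function(p) pmin(pmax(p, eps), 1 - eps)

# Logit transform: log(p / (1 - p)).
logit_control   <- qlogis(clamp(prop_control))
logit_treatment <- qlogis(clamp(prop_treatment))

# Welch two-sample t-test on the logit scale (assumed test, for illustration).
test <- t.test(logit_treatment, logit_control)
logit_diff <- mean(logit_treatment) - mean(logit_control)
```

Working on transformed proportions avoids fitting a model per transcript, which is consistent with the speed advantage described above.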
Usage
logit_diff_test(counts, successes, annotation)
Arguments
counts: Total count matrix
successes: Proportion matrix
annotation: Sample metadata
Value
Data frame with test results for all transcripts
Polysome profiling annotation example
Description
An alternative annotation data frame for polysome profiling experiments with monosome and polysome fractions.
Usage
polysome_annotation
Format
A data frame with 12 rows and 4 columns:
- Sample
Sample identifier
- Condition
Experimental condition (Control or Stress)
- Type
Fraction type (Total, Monosome, or Polysome)
- Replicate
Replicate identifier (Rep1 or Rep2)
Source
Simulated data generated for package examples
Examples
data(polysome_annotation)
head(polysome_annotation)
run_glm: Binomial GLM for Single Transcript
Description
Fits a binomial generalized linear model to test for differential proportions between conditions for a single transcript.
Usage
run_glm(transcript, counts, successes, sample_info)
Arguments
transcript: Transcript identifier
counts: Total count matrix
successes: Proportion matrix
sample_info: Sample metadata with Condition factor
Value
Data frame with test results for this transcript
Subcellular fractionation annotation example
Description
An annotation data frame for subcellular fractionation experiments with nuclear and cytoplasmic fractions.
Usage
subcellular_annotation
Format
A data frame with 12 rows and 4 columns:
- Sample
Sample identifier
- Condition
Experimental condition (WT or Mutant)
- Type
Fraction type (Total, Nuclear, or Cytoplasmic)
- Replicate
Replicate identifier (Rep1 or Rep2)
Source
Simulated data generated for package examples
Examples
data(subcellular_annotation)
head(subcellular_annotation)