
Transcriptome-wide identification of the RNA-binding landscape of the chromatin-associated protein PARP1 reveals functions in RNA biogenesis

Recent studies implicate poly(ADP-ribose) polymerase 1 (PARP1) in alternative splicing regulation and suggest that PARP1 may be an RNA-binding protein. However, the RNA targets and the RNA-binding region of PARP1 remain unknown. Here we report the first global study of PARP1–RNA interactions using PAR-CLIP in HeLa cells. We identified a largely overlapping set of 22 142 PARP1–RNA-binding peaks mapping to mRNAs, with 20 484 sites located in intronic regions. PARP1 preferentially bound RNA containing GC-rich sequences. Using a Bayesian model, we determined positional effects of PARP1 on regulated exon-skipping events: PARP1 binding upstream and downstream of the skipped exons generally promotes exon inclusion, whereas binding within the exon of interest and in intronic regions closer to the skipped exon promotes exon skipping. Using truncation mutants, we show that removal of the Zn1Zn2 domain switches PARP1 from a DNA binder to an RNA binder. This study represents a first step toward understanding the role of PARP1–RNA interaction. Continued identification and characterization of the functional interplay between PARPs and RNA may provide important insights into the role of PARPs in RNA regulation.

Introduction
Poly(ADP-ribose) polymerase 1 (PARP1), or ADP-ribosyl transferase 1, is a multifunctional nuclear protein that belongs to the PARP family of proteins. PARP1 is responsible for the initiation, elongation, and branching of ADP-ribose units from donor NAD+ molecules onto target proteins, a process known as PARylation. The major target for PARylation is PARP1 itself, but a number of other covalently PARylated proteins have been described, including histones, chromatin remodeling proteins, and transcription factors. PARylation influences the activity of target proteins by modulating protein–nucleic acid interactions, enzymatic activity, protein–protein interactions, and/or subcellular localization.

PARP1 was first characterized as a sensor for DNA breaks [1]. Besides its role in the DNA damage response, PARP1 plays a crucial role in regulating numerous molecular processes, such as gene transcription and chromatin remodeling [2–4]. Some of the best functional examples of PARP1 in gene regulation are its regulation of chromatin structure by PARylating histones and destabilizing nucleosomes [5–7], its competition with H1 for specific target sites [8], and/or its direct interaction with transcription factors and cofactors, such as NF-κB or the nuclear factor to activate T-cell gene expression [9–13]. PARP1 also plays critical roles in cell division. For instance, PARP1 regulates components of the mitotic apparatus, such as centromeres and centrosomes, to control microtubule organization during mitosis and chromosome segregation [10].

Taken together, these studies show that PARP1 exhibits a wide array of subcellular distributions, suggesting a broad and varied role for this protein [14, 15]. Although PARP1 has been implicated in multiple regulatory processes, one process for which the paradigm may change is its role in RNA biogenesis. First, PARP1 is known to PARylate poly(A) polymerase (PAP), inhibiting its polyadenylation activity [16], with consequences for pre-mRNA splicing regulation. Second, PARP1 binds to noncoding pRNAs to silence rDNA chromatin [17]. Third, PARP1 PARylates heterogeneous nuclear ribonucleoproteins (hnRNPs), which play important roles in pre-mRNA splicing and translation regulation [18]. Fourth, we recently identified PARP1 as an mRNA-binding protein [19, 20], providing further evidence that PARP1-/PARylation-mediated events function directly to control pre-mRNA processing. These findings serve to define PARP1 as a co-transcriptional splicing regulator [20]. One possible mechanism for this co-transcriptional function is that PARP1 acts as an adapter, bringing RNA close to chromatin [20]. In fact, a widespread association of chromatin-binding proteins with RNA was shown in vivo, supporting the idea of co-transcriptional RNA splicing [21].

We previously identified PARP1 as a novel RNA-binding protein (RBP) using photoactivatable-ribonucleoside-enhanced crosslinking and immunoprecipitation (PAR-CLIP). This study raised the interesting possibility that PARP1 plays crucial roles in many aspects of RNA processing to alter gene expression via regulation of mRNAs. Taken together, the identification and characterization of PARP1–mRNA interactions may provide important insights into the role of PARP1 in mRNA regulation and subsequent human disease. However, the breadth, range, and functional location of the mRNA types bound by PARP1 have not been explored.

To identify the direct RNA targets and precise binding sites of PARP1 protein in vivo, we again applied PAR-CLIP followed by deep sequencing (PAR-CLIP-seq). This method is known for its precise identification of binding sites, which result from T-to-C sequence conversions upon RNA–protein crosslinking. We observed that PARP1 was predominantly crosslinked to mRNAs. PAR-CLIP-binding regions contained guanine–cytosine-rich sequences, and RNA–protein interaction was further confirmed by gel mobility-shift assays. Furthermore, we narrowed down the RNA-binding region of the PARP1 protein. The enrichment of many other mRNA-binding proteins (mRBPs) among the large number of PARP1–mRNA targets suggests that PARP1 has a broad role in the regulation of many genes. Continued identification and characterization of the functional interplay between PARPs and RNA may provide important insights into the role of PARPs in RNA regulation.

Results
In our previous experiments we established for the first time that PARP1 binds to RNA in vivo [20]. In the present study, we expanded on previous studies [20, 22] to identify PARP1–RNA targets using the PAR-CLIP-seq method [23–26] (Figure 1a) in human HeLa cells. Following UV crosslinking, PARP1-bound RNAs were immunoprecipitated under stringent conditions. Radiolabeled PARP1-bound RNA complexes were separated by NuPAGE and observed using a Phosphorimager (Figure 1b). To ensure that only PARP1 protein-bound RNAs were used for further analysis, gels were transferred onto nitrocellulose membranes and visualized by autoradiography (Figure 1c), and the presence of PARP1-bound RNAs was confirmed by western blot analysis (Figure 1d). The results from these experiments demonstrate the robustness and specificity of the PARP1–RNA complexes identified by PAR-CLIP (Figure 1).

In the analysis of the phosphorimages of the radiolabeled PARP1–RNA complexes, we observed two major bands, one migrating at ~100 kDa and the other migrating at ~140 kDa. Based on estimation from the protein standard we used, the ~140-kDa band is indeed PARP1 (with other protein standards, this band runs according to PARP1's predicted molecular weight of ~116 kDa; see Supplementary Materials and Methods). This band was later confirmed by western blot analysis as PARP1-bound RNA (Figure 1d). Stringent digestion with RNase T1 resolved the PARP1–RNA bands to within the estimated molecular weight of PARP1, ~140 kDa (Supplementary Figure S1a). The 100-kDa band contains cleaved PARP1, as identified with an antibody that recognizes the C-terminal domain of PARP1 (data not shown).

In addition to these two bands, we also observed signals from a higher molecular weight complex (>260 kDa), possibly due to larger complexes that did not migrate into the gel (Supplementary Figure S1a). We suspect that this band likely represents other abundant RNA binding near PARP1-binding sites, or PARP1 crosslinked to longer target RNA segments [27] (Figure 1b–d). This interpretation is reasonable, given that similar trends have been observed with other RNA-binding proteins [28].

Figure 1 Processing of PAR-CLIP RNA samples for sequencing. (a) Outline of experiments. PAR-CLIP of endogenous PARP1 was performed on HeLa cells using 4-thiouridine (4SU). (b) PAR-CLIP samples run on an SDS PAGE gel were imaged with the Typhoon. (c) Protein-bound samples were transferred onto a nitrocellulose membrane, exposed to a phosphorimager screen, and imaged using the Typhoon. (d) The same membrane in c was probed for the presence of PARP1-bound RNAs using an antibody to PARP1. Red arrows (b–d) indicate PARP1-bound RNA: one at ~140 kDa and a shorter fragment at ~100 kDa. As the antibody recognizes the larger fragment, we considered this the full-length protein. The lower band could be a proteolytic fragment, as determined by mass spectrometry, lacking the N terminus and therefore undetectable by the antibody raised against the N-terminal domain of the protein. (e) Processing of PARP1-bound RNAs for sequencing. A representative denaturing (8 M urea) polyacrylamide gel showing the different steps of adapter ligation to RNA samples. PARP1-bound RNAs were eluted from the membrane in c, deproteinized, and ligated to 3′ and 5′ adapters (lane 4). Lanes 1 and 6 are control 19-mer and 24-mer labeled RNAs, ligated to the 3′ adapter. These 3′ adapter-ligated control RNAs were further ligated to 5′ adapters (lanes 2 and 7). The green arrow indicates unligated 19-mer and 24-mer RNAs (lanes 1 and 6, respectively); the blue arrow indicates 3′ adapter-ligated control RNAs; the red arrow indicates 3′ adapter- and 5′ adapter-ligated control RNAs. These controls were used to test the ligation efficiency of our samples. The black arrow (lane 4) indicates 3′ and 5′ adapter-ligated PARP1-bound RNA samples. (f) Adapter-ligated samples were subjected to limited PCR amplification. Lanes 1 and 2 show 3′ and 5′ ligated 19-mer and 24-mer control RNAs converted to cDNA and PCR-amplified. Lanes 3 and 4 are the PARP1-bound RNAs subjected to cDNA conversion and PCR amplification. The black arrow shows the PCR products used for sequencing.

To validate the specificity of PARP1–RNA binding, we performed several control experiments. (1) A control PAR-CLIP experiment using nonspecific antibodies (IgG) to precipitate RNA complexes failed to detect any RNA (not shown). (2) Cells not treated with thiouridine or non-crosslinked cells failed to immunoprecipitate a significant amount of PARP1-bound RNA (Supplementary Figure S1b, lanes 2 and 3, respectively), although PARP1 protein remained efficiently precipitated, as determined by western blot analysis of the immunoprecipitated complexes (Supplementary Figure S1c, lane 5, bottom; Supplementary Figure S1d, lanes 3 and 5, bottom). (3) Experiments with stringent RNaseA treatments eliminated the PARP1–RNA bands (Supplementary Figure S1b, lanes 6 and 7; Supplementary Figure S1e). (4) Knockdown of PARP1 abolished the PARP1–RNA band (Supplementary Figure S1c, lane 6). (5) Lastly, treatment of cells with PJ34 (a PARylation inhibitor) for 1 or 24 h did not change the PARP1–RNA-binding profile (Supplementary Figure S1c, lanes 3 and 4), suggesting that this binding is specific for PARP1 and not PAR.

After confirming PARP1–RNA binding, the PARP1–RNA complexes were cut from the membrane, eluted, deproteinized, purified, and ligated to adapters (Figure 1e). The resulting ligated RNAs were converted to cDNA, followed by limited PCR amplification (Figure 1f and Supplementary Figure S2a).

Initially, these PCR fragments were cloned into the TOPO-blunt vector, checked for correct insert size by restriction enzyme digest (Supplementary Figure S2b), and Sanger-sequenced. From these pilot experiments, the mean fragment length was 21 nucleotides (from the main 140 kDa PARP1–RNA band), 31 nucleotides (from the 200 kDa PARP1–RNA fragment), and 7 nucleotides (from the 70 kDa PARP1–RNA band; Supplementary Figure S2c). For subsequent studies, only the main PARP1-bound RNA bands (~140 kDa) were used. Seven biological replicate experiments were performed, barcoded, and pooled for paired-end (PE) Illumina sequencing on a HiSeq 2500. From the various biological replicates, we obtained 0.9–97 × 10^6 reads after sequencing (Supplementary Table S1). Adapter sequences were subsequently trimmed from these reads, yielding a total of 0.6–39 × 10^6 unique reads, 47% of which mapped to the human genome (hg38) allowing 0–2 mismatches.

Next, we grouped the reads by overlaps using the PARalyzer software [29]. The identified segments of RNA represented peaks of T-to-C conversion (binding sites), with a mean length of 21 nucleotides (mean and mode of 20 nt) from uniquely aligned T-to-C reconciled reads (Figure 2a and Table 1). Groups of overlapping PAR-CLIP sequence reads were considered binding sites if they (1) passed a threshold of ≥0.25 for T-to-C conversion frequency, (2) contained more than five reads with T-to-C conversion (one mismatch maximum allowed per read), and (3) showed at least two independent T-to-C conversions. Biological replicates, although sequenced at different depths, showed similar binding patterns (Supplementary Figure S3).

To identify PARP1–RNA target sites, we analyzed the distribution of PAR-CLIP tags in the human genome by defining six regions (exons, introns, promoters, 5′ UTRs, 3′ UTRs, and intergenic regions). The distribution of binding sites across individual transcripts provided insights into PARP1 targeting.
Approximately 48% of PAR-CLIP peak tags (see Materials and Methods) mapped to introns, ~8% mapped to exons, 2% to promoter regions, 2% to 3′ UTRs, 1% to 5′ UTRs, and 39% mapped to intergenic regions (Figure 2b).
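The three criteria used above to call a binding site from a group of overlapping reads can be pictured as a simple filter. The sketch below is illustrative only and is not the PARalyzer implementation: the `ReadGroup` structure, its field names, and the definition of conversion frequency as converted T bases divided by all read bases over T positions are assumptions made for this example, as is the reading of "independent conversions" as distinct converted positions.

```python
from dataclasses import dataclass


@dataclass
class ReadGroup:
    """Summary of one group of overlapping PAR-CLIP reads (hypothetical structure)."""
    t_bases_covered: int        # read bases aligned over reference T positions
    t_to_c_events: int          # how many of those bases were read as C
    reads_with_conversion: int  # reads carrying at least one T-to-C conversion
    conversion_positions: int   # distinct positions showing a T-to-C conversion


def is_binding_site(group, min_freq=0.25, min_reads=5, min_positions=2):
    """Apply the three thresholds quoted in the text to one read group."""
    if group.t_bases_covered == 0:
        return False
    # Assumed definition of conversion frequency (see lead-in).
    freq = group.t_to_c_events / group.t_bases_covered
    return (
        freq >= min_freq                               # criterion 1: frequency >= 0.25
        and group.reads_with_conversion > min_reads    # criterion 2: more than five reads
        and group.conversion_positions >= min_positions  # criterion 3: at least two conversions
    )


candidate = ReadGroup(t_bases_covered=40, t_to_c_events=12,
                      reads_with_conversion=8, conversion_positions=3)
print(is_binding_site(candidate))
```

Keeping the thresholds as keyword arguments makes it easy to see how sensitive the number of called sites is to each cutoff.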

The over-representation of intronic PAR-CLIP reads indicates that PARP1 binds pre-mRNAs (nascent transcripts) and is consistent with our hypothesis that PARP1 plays a role in pre-mRNA splicing and processing. On the other hand, the high percentage of PARP1 PAR-CLIP reads mapping to intergenic regions suggests that these PAR-CLIP tags may correspond to previously unidentified isoforms of genes with alternative terminal exons. To test this idea, we carried out two types of analyses. First, we examined the distance between intergenic clusters and neighboring RefSeq genes. An exponential increase in the cumulative number of tags within 10 kb downstream of known stop codons, compared to linear increases beyond 10 kb, was detected. For instance, 39% of these intergenic peaks mapped within 10 kb of the nearest stop or start codon, respectively (Figure 2c and Supplementary Table S2). This suggests that in addition to binding known 3′ UTRs (Figure 2b), PARP1 binds to unannotated 3′ UTR extensions of known genes (Supplementary Figure S4). Second, we asked whether the remaining intergenic reads map to genes annotated in other reference genomes, as determined from the ‘RefSeq Other’ track in the UCSC genome browser. We observed that 8% mapped to genes annotated within other RefSeq genomes. These more detailed analyses show that only ~18% (45% of the initial 39% intergenic reads shown in Figure 2b) of PARP1 PAR-CLIP tags map to intergenic regions (Figure 2c and Supplementary Figure S4).

Next, we analyzed the distribution of PARP1 PAR-CLIP reads in coding regions. This analysis showed that ~78% of the reads mapped to introns (Figure 2d), raising the possibility that PARP1 contributes to the recognition of specific intronically encoded RNAs such as mRNAs, microRNAs, small nuclear RNAs, and heterogeneous RNAs, and influences the rates of various competing RNA processing steps.
To examine this, we analyzed the types of RNAs bound by PARP1 from the PAR-CLIP data.

Our analyses show that most of the PAR-CLIP peaks were within mRNAs (88%) compared to the other RNA types, demonstrating that mRNA is the major substrate of the PARP1–RNA complex. On the other hand, crosslink sites were also detected in different classes of RNAs: 2 870 peaks (or 11% of total RNAs bound) in long intergenic noncoding RNAs, 124 peaks (or 1% of the total RNAs bound) in microRNAs, and 88 peaks within small nuclear RNAs (Figure 2e and Supplementary Table S3). These results suggest possible novel functions for PARP1 in the regulation of the metabolism of other RNAs as well.

As an alternative method to validate these binding sites, we performed formaldehyde-crosslink RNA immunoprecipitation with nuclear extracts [30]. Enrichment of candidate RNAs was similarly observed using this method (Supplementary Figure S5). Combined, these data support the specificity of PARP1 PAR-CLIP-seq and suggest that the observed interactions are indeed interactions between PARP1 and RNA.

We next asked whether PARP1 binds to a particular RNA sequence motif. For that, we applied cERMIT [31] to define the in vivo RNA recognition element for PARP1. The three highest-scoring motifs were generally GC-rich (Figure 3a); this nucleotide composition was observed regardless of the mRNA region of the identified PAR-CLIP tags (Figure 3a). Failure to determine a highly conserved binding motif prompted us to use an unbiased k-mer approach to determine the enrichment of specific sequences within the PAR-CLIP data.
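One way to picture a k-mer enrichment comparison is as a log-ratio of k-mer frequencies in sequences around crosslink sites versus a background set. The sketch below is only illustrative: the function names, the pseudocount, and the normalization are assumptions chosen for this example and are not the procedure used in this study.

```python
from collections import Counter
from math import log2


def kmer_counts(seqs, k):
    """Count every overlapping k-mer made only of A, C, G, T in a list of sequences."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= set("ACGT"):  # skip windows containing N or other symbols
                counts[kmer] += 1
    return counts


def kmer_log2_enrichment(fg_seqs, bg_seqs, k, pseudocount=1.0):
    """Return log2(foreground frequency / background frequency) for each k-mer.

    A pseudocount (an assumption of this sketch) avoids division by zero
    for k-mers that are absent from one of the two sets.
    """
    fg = kmer_counts(fg_seqs, k)
    bg = kmer_counts(bg_seqs, k)
    kmers = set(fg) | set(bg)
    fg_total = sum(fg.values()) + pseudocount * len(kmers)
    bg_total = sum(bg.values()) + pseudocount * len(kmers)
    return {
        kmer: log2(((fg[kmer] + pseudocount) / fg_total)
                   / ((bg[kmer] + pseudocount) / bg_total))
        for kmer in kmers
    }


# Toy sequences, not real data: a GC-rich foreground against a mixed background.
foreground = ["GCGCGCGC", "GGCCGGCC"]
background = ["ATATATAT", "GCATGCAT", "AATTAATT"]
scores = kmer_log2_enrichment(foreground, background, k=2)
for kmer in sorted(scores, key=scores.get, reverse=True):
    print(kmer, round(scores[kmer], 2))
```

Positive scores mark k-mers over-represented near crosslink sites and negative scores mark depleted ones; in a real analysis the background would be the genome or matched transcript regions rather than a handful of strings.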

For this, the 2-nt PARP1 PAR-CLIP data set surrounding the crosslink sites was compared to the genome as a whole to identify k-mers enriched in PARP1 PAR-CLIP reads. Our choice of k-mers allowed us to detect smaller localized signals than cERMIT, which begins with 5-mer seed regions. Starting with 3-mers, we observed an enrichment of GC-rich 3-mers (data not shown). However, as RNA recognition elements are typically longer than 3-mers, we performed further analyses using 4-mers. Again, this analysis showed an enrichment of GC-rich 4-mers (Figure 3b), whereas AT-rich 4-mers were depleted (Figure 3c and d). We repeated the analyses with 6-mers and 8-mers, and the enriched k-mers were again clearly GC-rich, although these longer GC-rich k-mers are interspersed with AT k-mers (Supplementary Figure S6). Our data show that PARP1 RNA-binding sites were comparatively GC-rich, suggesting a tolerance for GC-rich residues, whereas AT-rich residues were relatively less well tolerated. This information is of interest because during PAR-CLIP experiments G-containing sequences are normally trimmed by RNase T1, and the only way for these guanosines to survive this cleavage is if they are protected by direct binding of PARP1 or by stable RNA secondary structure [32]. Our results therefore suggest that PARP1 binds to GC-rich regions and protects these G-rich regions from RNase T1 cleavage.

To test whether transcripts bound by PARP1 are affected upon PARP1 depletion, we determined the global patterns of PARP1-dependent transcription/splicing changes. For this, cells were transfected with ONTARGETplus short interfering RNA (siRNA) targeting PARP1 or, for control experiments, with non-targeting siRNAs. Depletion of PARP1 protein was confirmed by western blot analyses, which showed an ~70% reduction in PARP1 protein levels in the knockdown cells (Figure 4a).
Total RNA was isolated from control non-targeting siRNA and PARP1 knockdown (KD) cells, and poly(A)-selected mRNA sequencing was performed on the Illumina platform. Biological replicates from RNA-seq showed high Pearson correlation (Supplementary Table S4), allowing pooling of samples for further analyses. First, we measured changes in gene expression at the transcript level due to PARP1 knockdown.

We identified 217 significantly upregulated and 81 downregulated genes, including PARP1 (using a cutoff of twofold expression and P-value of 0.05 versus non-targeting control; Supplementary Table S5). GO analysis using gene set enrichment analysis (GSEA) showed that the top biological processes targeted by the genes upregulated in PARP1 KD cells are NMD, translation, protein metabolism, selenocysteine synthesis, and gene expression. Genes that were downregulated in PARP1 knockdown cells are involved in RNA binding and poly-A-RNA binding, as determined using GSEA (Figure 4b). We next compared PARP1 RNA targets to genes affected by PARP1 knockdown, and did not observe any meaningful correlation between genes that were bound and trends in gene expression changes. Nevertheless, we observed that ~29% of gene transcripts affected by PARP1 knockdown were also bound by PARP1 in our PAR-CLIP analysis (Figure 4c).

Our previous study in Drosophila cells suggested that PARP1 plays a role in alternative splicing regulation [20]. To assess the effect of PARP1 on splicing, we also analyzed the RNA-seq data for differential alternative splicing events. Using stringent criteria to identify changes in alternative splicing events, we showed that PARP1 depletion resulted in changes in alternative splicing for 791 genes. These changed events included mutually exclusive exons (42.4%), skipped exons (25.6%), retained introns (4.2%), alternative 5′ splice sites (23.5%), and alternative 3′ splice sites (4.4%; Figure 4d). We validated some of these changes in alternative splicing due to PARP1 depletion using qRT-PCR (Supplementary Figure S7).

The number of alternatively spliced genes is slightly lower than that observed in our previous studies with Drosophila, where we observed many more changes in alternative splicing [20]. We attribute this low number to possible redundancy with other PARP proteins in humans. GO molecular function terms, as determined using GSEA for the targeted alternatively spliced genes, include nucleosome binding, poly-A binding, and RNA binding (Figure 4e).

Positional effects of PARP1 in splicing regulation

To extend the analysis of the role of PARP1 in alternative splicing, we averaged the presence of PARP1 PAR-CLIP reads along all exon/intron and intron/exon boundaries, representing 3′ and 5′ splice sites, respectively. PARP1 binds uniformly within introns, whereas its binding is enriched at the ends of exons, specifically within 50 nucleotides upstream of the start of the exon and 50 nucleotides downstream of the end of the exon (Figure 5a). The observed exon bias reflects the distribution of binding sequences within target RNAs and suggests that PARP1 binds mRNA. Although we had observed PARP1 PAR-CLIP reads in introns (Figure 2b), the density of these reads at exon–intron boundaries suggests a functional role of PARP1 in demarcating exons. Thus, the preferential binding of PARP1 at exonic sequences, especially upstream of 5′ and 3′ splice sites, is consistent with the model that proteins that regulate splicing bind pre-mRNA at functional regions.

To better understand the impact of PARP1 on splicing, we combined PAR-CLIP data with the analysis of splicing profiles upon PARP1 depletion to determine the position-dependent regulatory effects of PARP1–RNA interactions. To this end, we analyzed the rMATS outputs for skipped exon events using the bioinformatics software rMAPS [33], which systematically generates RNA maps for the identification of position-dependent effects of RNA-binding proteins.
The rMAPS program is useful for the computational detection of binding sites around differential alternative splicing events for over 100 known RBPs. Using the rMAPS-based analysis (with default parameters), along with the list of all PARP1 PAR-CLIP peaks and detected skipped exon events, we identified binding patterns of PARP1 within the PARP1-dependent alternatively spliced exons (Figure 5b).

Restricting the analyses to only significant exon-skipping splice events, we found that for enhanced and included exons, there is significant PARP1 binding about 125 bp downstream of the adjacent 5′ exon and about 250 bp upstream of the adjacent 3′ exon (peaks in red). If the exon is excluded, there is significant binding of PARP1 within the exon itself (in blue) as well as within the upstream and downstream introns. Although it is possible that factors related to translational efficiency and/or RNA stability may affect the regulatory landscape of PARP1-responsive splicing events, the differential expression of PARP1, together with the enrichment of PARP1 binding and its positional enrichment relative to the regulated exons, suggests that many or most of the identified skipped exon splicing events are likely direct targets.

Biochemical characterization of PARP1 protein–RNA-binding sites

PARP1 encompasses several functional domains: three zinc-finger domains (Zn1, 2, and 3), a nuclear localization signal region, a breast cancer suppressor protein-1 domain (BRCT), a WGR domain (automodification domain), and the catalytic PARP domain (Figure 6a). To begin to understand PARP1–RNA binding, we purified from bacterial cells recombinant full-length human PARP1 (PARP1-FL) and truncated mutants lacking the C-terminal catalytic active site (ΔCAT), the DNA-binding domain (DBD) comprising the first two zinc fingers (ΔZn1Zn2), the third zinc-finger domain (ΔZn3), the automodification domain (ΔWGR), or the protein–protein interaction domain (ΔBRCT) (Figure 6b). Their presence was confirmed through western blot analyses using PARP1 antibody (lanes 1–6, respectively; Figure 6c), and their proper folding was confirmed using circular dichroism spectroscopy (Supplementary Figure S8a). We addressed whether direct PARP1–RNA binding is dependent on other factors, such as contaminating DNA and/or PARP1 PARylation.
First, recombinant PARP1-FL was incubated with a radiolabeled synthetic 19-mer ssRNA (chrom15: 53554024-53554044) corresponding to one of the binding sites identified by PAR-CLIP. The protein–RNA complexes were then resolved on a native polyacrylamide gel (Figure 6d and Supplementary Figure S8b).

A supershift corresponding to the PARP1–RNA complex was observed (Supplementary Figure S8b, lane 2). Second, the PARP1–RNA complex was treated with either DNase1 or RNaseA, confirming that RNA is the nucleotide species bound by PARP1, as DNase1 treatment did not change the binding profile but RNaseA completely digested the RNA (Supplementary Figure S8b, lanes 5 and 6, respectively). In addition, treatment of PARP1 with PJ34 did not inhibit PARP1 binding to RNA (Supplementary Figure S8b, lane 4), whereas PARylation of PARP1 by NAD+ abolished its RNA binding (Supplementary Figure S8b, lane 3), indicating that PARP1–RNA binding is due to PARP1 and not PAR. As a control, RNA was incubated with increasing amounts of bovine serum albumin, and no significant shift in RNA mobility was observed (data not shown).

We next asked which domain of PARP1 is required for PARP1–RNA binding. EMSA was performed using PARP1-FL as well as the truncated mutants by individually incubating them with the radiolabeled synthetic 19-mer RNA (as above; Figure 6d). As seen previously, discrete shifted bands corresponding to PARP1–RNA complexes were observed for all the proteins tested. We then determined the binding affinities of PARP1-FL and the mutants for RNA by performing EMSA, incubating 0.05 μM radiolabeled 19-nt RNA with increasing concentrations (0–2.5 μM) of PARP1-FL or truncated proteins (Figure 7a–f). The fraction of total RNA bound as a function of increasing protein concentration was used to calculate the affinity of each protein for RNA (Figure 7g and Supplementary Figure S9). Interestingly, these proteins bind with different stoichiometry, and this difference in binding stoichiometry was taken into account when calculating the affinity constants, Kassoc (Table 2).
These results show only a two- to threefold difference in affinity for RNA between the PARP1 proteins, with PARP1-FL having the highest affinity, whereas ΔZn3 showed the lowest affinity, followed by ΔZn1ΔZn2 (Table 2). These data are in line with previous studies that showed that PARP1 binds RNA via its zinc-finger 3 domain [34]. Interestingly, deletion of another region previously implicated in binding RNA (WGR) did not significantly change the affinity from that of PARP1-FL. Similar binding affinity results were obtained using RNAs of different lengths (20 and 24 nt; Supplementary Table S6).
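The step from fraction bound to an affinity constant can be illustrated with a Hill-type binding model. The following is a minimal sketch under stated assumptions, not the fitting procedure used for Table 2: it assumes the free protein concentration is approximately the total protein concentration, uses the Hill coefficient n as a stand-in for binding stoichiometry, and fits the linearized Hill equation, log10(θ/(1−θ)) = n·log10[P] − n·log10(Kd), by ordinary least squares. The function names and the example numbers are hypothetical.

```python
from math import log10


def fraction_bound(bound_signal, free_signal):
    """Fraction of RNA in the shifted band, from band intensities of one EMSA lane."""
    return bound_signal / (bound_signal + free_signal)


def hill_fit(protein_conc, frac_bound):
    """Fit the linearized Hill equation and return (Kd, n, Kassoc).

    Points with zero protein, or with a bound fraction of exactly 0 or 1,
    are skipped because their logarithms are undefined.
    """
    xs, ys = [], []
    for conc, theta in zip(protein_conc, frac_bound):
        if conc > 0 and 0 < theta < 1:
            xs.append(log10(conc))
            ys.append(log10(theta / (1 - theta)))
    if len(set(xs)) < 2:
        raise ValueError("need at least two usable concentrations")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    n = sxy / sxx                    # slope = Hill coefficient
    intercept = mean_y - n * mean_x  # intercept = -n * log10(Kd)
    kd = 10 ** (-intercept / n)
    return kd, n, 1.0 / kd           # Kassoc is the reciprocal of Kd


# Synthetic titration (in μM) generated from Kd = 0.5 μM and n = 2; not measured data.
conc = [0.0, 0.1, 0.25, 0.5, 1.0, 2.5]
theta = [c ** 2 / (0.5 ** 2 + c ** 2) for c in conc]
kd, n, kassoc = hill_fit(conc, theta)
print(kd, n, kassoc)
```

With real gel data a nonlinear fit of the untransformed binding curve is usually preferable, since the log transform inflates the weight of points near 0 and 1; the linear form is used here only to keep the example dependency-free.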

Although at first surprising, similarly small differences in affinity have also been recorded for the binding of these constructs to DNA [35], even though PARP1 is a well-known DNA-binding protein. These previous results led to the hypothesis that all the domains of PARP1 contribute to its DNA-binding interactions. We believe that a similar scenario occurs with PARP1 binding to RNA.

Following on these results, we examined the possibility that RNA activates PARP1 and showed that, just like DNA, RNA activates PARP1, albeit to a lesser extent (Supplementary Figure S10). Finally, we performed a competition assay to test whether PARP1 preferentially binds DNA over RNA. Equal concentrations of radiolabeled 19-mer RNA and radiolabeled ssDNA of the same sequence were incubated together with increasing concentrations of the different PARP1 constructs. As ssRNA and ssDNA of the same sequence run with different gel mobility, we could quantify the disappearance of the RNA and DNA in the presence of these recombinant PARP1 proteins. This analysis revealed that PARP1-FL had a 25-fold higher affinity for DNA than for RNA (Figure 8a for PARP1-FL). A similar result was observed with the other constructs (Table 3), except for the ΔZn1ΔZn2 mutant. This mutant switched PARP1's binding preference from DNA to RNA, with a sevenfold preference for RNA over DNA (Figure 8b and Table 3). These results indicate that, once the Zn1Zn2 site is unavailable, PARP1 preferentially binds RNA, and suggest that the DNA-binding site is distinct from the site needed to bind RNA.

Discussion
The transcriptome analysis performed here by high-throughput PAR-CLIP sequencing provides new insights into the endogenous RNA targets of PARP1. We found that PARP1 binds RNA in vivo (Figure 1). We also observed that, whereas the main target of PARP1–RNA binding in vivo is mRNA, PARP1 also binds other non-coding RNAs (Figure 2), suggestive of a functional role of PARP1 in their regulation. Within mRNAs, we find that PARP1 associates mainly with intronic sequences (Figure 2). However, since introns are very long and PARP1 could target different regions of a particular intron, we also analyzed the density of the reads at functional splice sites. Our results show that there is a high density of PARP1–RNA binding at exon–intron and intron–exon boundaries (Figure 5). These results could suggest that PARP1 demarcates exons. Interestingly, we previously showed that PARP1 binds GC-rich nucleosomes at exon boundaries [20]. It is therefore logical to assume that it binds to similar regions on chromatin as well as on RNA, possibly by recognizing specific sequences or structures on DNA and/or RNA. However, additional studies are needed to determine the structural implications of PARP1 binding. We further combined the PAR-CLIP-seq analysis with full transcriptome-wide analysis of gene expression and splicing changes upon PARP1 depletion. Combining PAR-CLIP and RNA-seq data allowed us to draw a PARP1 RNA map, which suggested that the binding of PARP1 on exons and in intronic regions immediately surrounding the regulated skipped exon leads to silencing of the downstream exon. PARP1 binding to introns further upstream and downstream of the skipped exon enhances exon inclusion (Figure 5b).

The high distribution of PARP1 in introns (Figure 2) enhances the idea of a regulatory role of PARP1 in splicing, as intronic-binding proteins such as HNRNPU [36], HNRNPH1 [37], and HUR [38] havebeen implicated in splicing decisions. Under this sce- nario, the binding of PARP1 to intronic sequences mediates splicing; however, it can also remain asso- ciated with the mature mRNAs to help in other post- transcriptional mRNA processes. This seems to beoccurring, as we observe a high PARP1 PAR-CLIP read density, at the ends of exons (exon–intron and intron–exon boundaries depicting 3′ and 5′ splice sites, respectively), and is in line with other intron-bindingproteins [39]. Noteworthy is the fact that proteins that bind at exons interact with the RNA after transcription and initial RNA processing, whereas the intron binders are present during transcription [21], thus supporting their role in co-transcriptional splicing. However, because of the low CLIP efficiency (only ~ 1% of transcripts are crosslinked), it is difficult to distinguishwhether the PARP1–RNA interactions are on pre-mRNA transcripts or whether a subset of these mRNAs is subsequently processed (in either alternative exons or poly (A) sites). On the other hand, our RNA map of PARP1 binding (Figure 5b) provides a func- tional landscape of significantly skipped alternative splicing regulation by PARP1 that can be used in future studies to further characterize the regulation of AS by PARP1. PARP1 could be modulating splicing decisions through two mutually non-exclusive mechanisms: (i) maintaining a chromatin structure that affects RNA polymerase kinetics and/or (ii) recruiting and PARylating splicing factors to splice sites on nascent mRNAs while bound onto chromatin. PARP1 has been implicated in many cellular pro- cesses. In this study, we focused on the observation that PARP1 is involved in splicing regulation [20]. The means by which PARP1 regulates alternative splicing isstill unknown. 
Earlier models of gene expression regulation held that DNA-binding proteins respond to sequence composition and chromatin context to promote transcription of RNA [40, 41], and that RNA-binding proteins (RBPs) then bind the nascent transcripts to direct mRNA splicing, stability, localization, and translation [42, 43]. However, recent advances in profiling nucleic acid–protein interactions have shown that many DNA-binding proteins also associate with RNA to modulate both transcriptional and post-transcriptional outcomes [19, 44–46], blurring this long-standing view of gene regulation. The results presented here show that PARP1, a well-known DNA/chromatin-binding factor, also binds RNA, adding to this growing list of proteins that interact with both DNA and RNA to affect gene regulation. Our study further suggests that PARP1 binding to RNA may regulate splicing and, more generally, different levels of RNA biogenesis. Collectively, these studies suggest a more intertwined gene regulatory network (transcription and splicing) than had previously been appreciated.

Indeed, it is now known that splicing is tightly integrated with gene expression [47, 48], with splicing controlling gene expression via nonsense-mediated [49] or spliceosome-mediated [50] decay pathways. Unspliced and partially spliced transcripts can be deleterious for the cell [51, 52], and several quality-control pathways exist to degrade these faulty transcripts. The first and main line of protection is degradation by the nuclear exosome [53–55]. If this fails, a second line of defense operates via cytoplasmic surveillance pathways [52], leading to cytoplasmic degradation. This can be triggered in two ways: by the nonsense-mediated decay (NMD) pathway, which recognizes premature stop codons [56, 57], or by the non-stop decay pathway, which identifies transcripts lacking stop codons [58]. Interestingly, PARP1 depletion led to upregulation of transcripts encoding proteins involved in the NMD pathway and a decrease in transcripts encoding proteins involved in poly(A) RNA binding, pointing to an intersection of PARP1 with RNA biogenesis. Several studies implicate PARP1 in multiple steps of RNA biogenesis, such as RNA metabolism [59], mRNA metabolism, and protein synthesis [3, 60].

Furthermore, splicing factor 3A subunit 1, splicing factor 3B subunit 1, splicing factor 3B subunit 2 [61], and alternative-splicing factor 1/splicing factor 3 [62] are either targets of poly(ADP-ribosyl)ation or bind directly to PARP1. The function of poly(ADP-ribose) binding, binding to PARPs, and ADP-ribosylation of these splicing factors is not well understood.

In this study, we show that PARP1, a known DNA-binding protein, binds RNA both in vivo (Figure 1 and Supplementary Figure S1) and in vitro (Figure 7). Our forward competition assays of PARP1 binding to DNA and RNA showed that, in the absence of the Zn1Zn2 domain, PARP1 preferred binding to RNA over DNA (Figure 8). These results are consistent with our model of PARP1’s role in co-transcriptional splicing [20], in which PARP1 binds chromatin through the Zn1Zn2 domain and, with that domain occupied, can still bind nascent mRNA through another domain.

Does PARP1 recognize a specific RNA motif? Previous studies showed that PARP1 binds the DNA motif AGGCC [63] and/or binds in the vicinity of the DNA motif GGAAGG [64]. In our analysis, we did not find an enriched RNA motif for PARP1 binding; we did, however, find that PARP1 binds GC-rich RNA sequences (Figure 3 and Supplementary Figure S6). It is tempting to speculate that, in binding these GC-rich sequences, PARP1 recognizes a structure that they form. One such structure is the G-quadruplex, which has also been implicated in splicing regulation. In fact, PARP1 binds G-quadruplexes in vivo [65–67]. However, additional studies will be needed to test whether PARP1 RNA targets form structures such as the G-quadruplex.
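One simple way to begin exploring this question computationally is to measure the GC content of binding-site sequences and to scan them for putative quadruplex-forming sequences. The minimal Python sketch below does both. It was not part of this study: the function names `gc_fraction` and `find_pqs` are hypothetical, and the pattern used (four runs of at least three G residues separated by loops of 1–7 nucleotides) is a commonly used sequence heuristic that is assumed here for illustration. A match indicates only sequence potential and does not show that a G-quadruplex forms in vivo.

```python
import re

# Heuristic pattern for a putative quadruplex-forming sequence (assumption):
# four G-runs of length >= 3 separated by loops of 1-7 nucleotides.
PQS_PATTERN = re.compile(
    r"G{3,}[ACGU]{1,7}G{3,}[ACGU]{1,7}G{3,}[ACGU]{1,7}G{3,}"
)


def _as_rna(seq):
    """Upper-case the sequence and convert DNA-style T to U."""
    return seq.upper().replace("T", "U")


def gc_fraction(seq):
    """Return the fraction of G or C residues in seq (0.0 for an empty string)."""
    rna = _as_rna(seq)
    if not rna:
        return 0.0
    return sum(base in "GC" for base in rna) / len(rna)


def find_pqs(seq):
    """Return (start, end) spans of non-overlapping matches to PQS_PATTERN."""
    return [(m.start(), m.end()) for m in PQS_PATTERN.finditer(_as_rna(seq))]


if __name__ == "__main__":
    example = "AAGGGAGGGUGGGCGGGAA"  # made-up sequence, not from the data
    print(gc_fraction(example))
    print(find_pqs(example))
```

In practice, the GC content of binding sites would be compared against matched background sequences (for example, flanking or shuffled regions) before drawing any conclusion about enrichment.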

Our finding that deletion of the third zinc finger of PARP1 yielded the mutant with the lowest affinity for RNA (Figure 7 and Table 2) supports the idea that PARP1 uses Zn3 to bind RNA in vitro [34] or pRNA [17]. The small difference in affinity between PARP1-FL and its truncation mutants could imply either that (i) all regions contribute to RNA binding or (ii) PARP1 contains an as yet undiscovered RNA-binding region. These possibilities are not far-fetched, since other PARPs lacking some of the domains of PARP1 bind RNA. For instance, PARP12 and PARP13 bind RNA through their zinc fingers, whereas PARP14 and PARP10 have possible RRMs on different protein domains [68]. In addition, PARP7, which lacks these zinc-finger domains, still binds RNA [69]. It is currently not clear whether PARP1 has an RNA recognition motif, although in addition to the zinc-finger 3 domain the WGR domain can also bind RNA [34]. Future studies will be critical to determine the exact RNA recognition motif of PARP1.

In light of PARP1’s in vivo binding to RNA, its effect on splicing, and its importance in regulating the expression of transcripts encoding proteins important for NMD and poly(A) binding, it is tempting to hypothesize that PARP1 is involved in genome surveillance. This hypothesis seems plausible if one considers its role in DNA repair: although PARP1 does not execute the repair itself, it binds to the site of damage and recruits repair proteins there [1]. Furthermore, in transcription regulation, it stalls polymerase elongation [6, 70], thereby possibly allowing proper genome surveillance. Once surveillance is complete, in the absence of any DNA damage, it PARylates histones, releasing the repression on polymerase elongation [5, 6]. We believe that a similar scenario is likely in splicing.
PARP1 by itself does not splice, but binds to specific splice sites [20] (Figure 5), possibly recruiting and/or activating splicing factors at those sites. Although recruitment of splicing factors has not been shown, PARP1 PARylates and activates splicing factors [62]. This idea is further bolstered by its functions at the 3′ ends of mRNAs, where PARP1 PARylates poly-A binding protein (PAP), thus decreasing the ability of the modified PAP to bind RNA.

This PARylation effect has also been shown for several other 3′ processing factors, such as PABPN1 and all CPSF subunits [59], pointing to the possibility that PARP1 might be a general regulator of 3′ processing. Lastly, studying PARP1 in many different contexts has probably led to the view that it performs many separate functions; however, it is also tempting to speculate that it acts more generally as a surveillance molecule that ensures genome stability.

Our understanding of the role of PARPs and PAR in transcriptional and post-transcriptional regulation of gene expression through modulation of RNA is still in its early stages. Our studies, however, provide a useful platform from which to begin to uncover and decipher PARP1’s role in the many steps of RNA biogenesis.

HeLa cells were used for PAR-CLIP experiments. Cells were grown at 37 °C in a humidified environment containing 5% CO2 and 95% air in Dulbecco’s modified Eagle’s medium (Sigma) containing 1 mM sodium pyruvate and 0.1 mM nonessential amino acids, and supplemented with 10% fetal bovine serum, 100 U ml−1 penicillin, and 100 μg ml−1 streptomycin. For each experiment, ~6 × 10^8 cells (~60 × 15 cm cell culture plates) were used.

Cells were cultured to 80–90% confluency and then treated overnight with 4-thiouridine, added directly to the cell culture medium to a final concentration of 100 μM. Cells were washed with ice-cold phosphate-buffered saline (PBS), the liquid was aspirated, and the plates were placed on ice and irradiated with UV light at 365 nm (150 mJ cm−2). Cells were then scraped off the plates and collected by centrifugation at 2000 r.p.m. for 10 min.

PAR-CLIP was performed as previously described [26] with some modifications. Briefly, 10 ml of packed pellet of UV-treated cells was lysed with 3 volumes of 1× NP40 lysis buffer on ice for 5 min. The lysate was then centrifuged at 18 000g for 15 min in an Eppendorf 5810R centrifuge with an A-4-81 swing-bucket rotor.

The supernatant was filtered through a 5 μm syringe filter (Sterile Acrodisc Syringe Filters with Supor Membrane; Ann Arbor, MI, USA) to remove cellular debris. The filtrate was partially digested with RNase T1 (Roche, Pleasanton, CA, USA) at a final concentration of 1 U μl−1 for 15 min. The RNase-treated supernatant was then incubated for 2 h with 600 μl of protein A Dynabeads (Invitrogen, Thermo Fisher Scientific, Waltham, MA, USA) bound to 15 μg of anti-PARP1 antibody (Active Motif, Carlsbad, CA, USA) or control IgG antibody. The beads were washed three times, and the immunoprecipitated RNA was digested again with RNase T1 at a final concentration of 63 U μl−1 for 15 min. After dephosphorylation, the RNA segments crosslinked to PARP1 were 5′-radiolabeled using γ-32P-ATP and T4 polynucleotide kinase (Promega, Madison, WI, USA) in one original bead volume.

After several washes, each CLIP sample (on the beads) was treated with 5 U of DNase I (NEB, Ipswich, MA, USA) per 100 μl of bead volume for 15 min at 37 °C. DNase I was inactivated by adding 5 mM EDTA and heating at 65 °C for 10 min. Samples were then resuspended in SDS-PAGE loading buffer and incubated at 95 °C for 5 min to denature the samples and release the PARP1–RNA crosslinks. The samples were then separated on 4–12% NuPAGE gels (Invitrogen) and transferred onto nitrocellulose membranes (1/10th of the sample was used for immunoblotting and the rest was used for autoradiography).

The gel containing 1/10th of the sample and the membrane containing 9/10th of the sample were exposed to a phosphorimager screen overnight and visualized by scanning on a Typhoon FLA 9500. PARP1–RNA complexes were cut from the membrane and treated with proteinase K (Roche), followed by phenol/chloroform/IAA extraction and ethanol precipitation. The recovered RNA was used for cDNA library preparation.

For this purpose, we used the NEBNext Multiplex Small RNA Library Prep Set for Illumina (Set 1). Library preparation, including 3′ and 5′ SR adaptor ligations, reverse transcription, and PCR amplification, was performed according to the manufacturer’s protocol. To remove adaptor-only ligation products, samples were purified on 15% acrylamide–8 M urea gels after each ligation step (3′ adaptor ligation and 5′ adaptor ligation). Lastly, after limited PCR amplification, PCR products were size-selected on a 3.5% NuSieve (Lonza, Walkersville, MD, USA) low-melting-point agarose gel. Expected PCR products were eluted using the ‘crush and soak’ method, followed by purification on a Qiagen MinElute PCR column. Samples were first cloned into the Topo TA vector for pilot analysis and then sequenced using 100 bp paired-end sequencing on an Illumina HiSeq 2500.

Protein samples were resuspended in SDS sample buffer, separated on 4–12% NuPAGE gels (Invitrogen, Thermo Fisher Scientific, Waltham, MA, USA), transferred onto nitrocellulose membranes, blocked with 5% fat-free milk in PBST, and incubated with primary antibodies for 16 h at 4 °C. After several washes with PBST, the membranes were incubated with secondary antibodies conjugated to alkaline phosphatase for 1 h at room temperature, and the signal was developed with ECL reagents (GE Healthcare, Pittsburgh, PA, USA). Images were obtained using a Typhoon 9400.