000000157 001__ 157
000000157 005__ 20260410124443.0
000000157 0247_ $$2DOI$$a10.6083/M4PZ56RX
000000157 037__ $$aETD
000000157 245__ $$aA data cleaning and annotation framework for genome-wide studies
000000157 260__ $$bOregon Health and Science University
000000157 269__ $$a2007
000000157 336__ $$aThesis
000000157 502__ $$bM.S.
000000157 502__ $$gComputer Science & Electrical Engineering (sunsetting)
000000157 520__ $$aGenome-wide studies depend heavily on the quality and consistency of integrated annotation data from diverse computational and experimental sources. This thesis presents a generalized framework for detecting and managing discrepancies within and between annotation datasets by addressing biological identity, data relationships, source independence, and conflicts. The workflow identifies errors and either resolves or incorporates inconsistencies into downstream analyses. The framework’s utility is demonstrated through construction of a genome-wide mouse transcription factor binding map and classification of single nucleotide polymorphisms. We further examine the impact of annotation discrepancies on downstream analyses and discuss future extensions for biologically meaningful summarization of inconsistencies.
000000157 540__ $$fCC BY
000000157 542__ $$fIn copyright - single owner
000000157 650__ $$aGenomics$$033016
000000157 650__ $$aWorkflow$$038890
000000157 650__ $$aTranscription Factors
000000157 650__ $$aMice$$036842
000000157 650__ $$aPolymorphism, Single Nucleotide$$032570
000000157 650__ $$aComputational Biology$$031511
000000157 650__ $$aGenome-Wide Association Study$$038168
000000157 6531_ $$aannotations
000000157 6531_ $$abiological data source
000000157 6531_ $$adata cleaning
000000157 691__ $$aOGI School of Science and Engineering$$041365
000000157 692__ $$aDepartment of Computer Science and Engineering$$041405
000000157 7001_ $$aRamakrishnan, Ranjani$$uOregon Health and Science University$$041354
000000157 7201_ $$aMcWeeney, Shannon$$uOregon Health and Science University$$041354$$7Personal$$eAdvisor
000000157 8564_ $$99e2340f6-6a9c-4215-8f43-8de69b580775$$s550824$$uhttps://digitalcollections.ohsu.edu/record/157/files/157_etd.pdf$$ePublic$$29ef8a816adf8e0d1e8a8b3e420f27616$$31
000000157 901__ $$a<p>These documents are archival records. They are retained for historical reference only. </p><p><b>Need an accessible version? Use the ‘Get Accessible Copy’ link above.</b></p>
000000157 905__ $$a/rest/prod/vq/27/zn/41/vq27zn41g
000000157 909CO $$ooai:digitalcollections.ohsu.edu:157$$pstudent-work
000000157 956__ $$aGet Accessible Copy$$uhttps://ohsu.libwizard.com/f/requestaccessibledocument
000000157 980__ $$aTheses and Dissertations