000002707 001__ 2707
000002707 005__ 20260120144901.0
000002707 0247_ $$2DOI$$a10.6083/M4DJ5CZ2
000002707 037__ $$aETD
000002707 245__ $$aDetecting and analyzing genomic structural variation using distributed computing
000002707 269__ $$a2014
000002707 336__ $$aDissertation
000002707 502__ $$bPh.D.
000002707 502__ $$gComputer Science & Electrical Engineering (sunsetting)
000002707 520__ $$aGenomic structural variations are an important class of genetic variants with a wide variety of functional impacts. The detection of structural variations using high-throughput short-read sequencing data is a difficult problem, and published algorithms do not provide the sensitivity and specificity required in research and clinical settings. Meanwhile, high-throughput sequencing is rapidly generating ever-larger data sets, necessitating the development of algorithms that can provide results rapidly and scale to use cloud and cluster infrastructures. MapReduce and Hadoop are becoming a standard for managing the distributed processing of large data sets, but existing structural variation detection approaches are difficult to translate into the MapReduce framework. We have formulated a general framework for structural variation detection in MapReduce, and implemented a software package called Cloudbreak, which detects genomic deletions and insertions with very high accuracy compared to existing popular tools.
000002707 540__ $$fCC BY
000002707 542__ $$fIn copyright - single owner
000002707 650__ $$aComputational Biology$$031511
000002707 650__ $$aMachine Learning$$011449
000002707 650__ $$aArtificial Intelligence$$015109
000002707 650__ $$aGenomics$$033016
000002707 650__ $$aGenomic Structural Variation$$038764
000002707 691__ $$aSchool of Medicine$$041369
000002707 692__ $$aCenter for Spoken Language Understanding$$041388
000002707 7001_ $$aWhelan, Christopher$$uOregon Health and Science University$$041354
000002707 7201_ $$aSönmez, Kemal$$uOregon Health and Science University$$041354$$7Personal$$eAdvisor
000002707 8564_ $$9d5fa625e-1b28-425c-8049-f261313db7fc$$s3742101$$uhttps://digitalcollections.ohsu.edu/record/2707/files/3482_etd.pdf$$ePublic$$2696cc8eadc814c7a2344b2d80481a51d$$31
000002707 905__ $$a/rest/prod/js/95/6f/97/js956f97k
000002707 909CO $$ooai:digitalcollections.ohsu.edu:2707$$pstudent-work
000002707 956__ $$aGet Accessible Copy$$uhttps://ohsu.libwizard.com/f/requestaccessibledocument
000002707 980__ $$aTheses and Dissertations