Paper Title
DNA Probe Signal Processing for Identification of Abnormal Gene Regulation and Pathogenetic Understanding - A Data Mining Approach

Gene expression microarrays use DNA probes to measure signal intensity in hybridized biological samples and have become a major source of high-throughput experimental data. The raw, probe-level signal supports a comprehensive understanding of the overall microarray data set, which is especially useful when the goals of the research differ from those of the original data producer or contributor. Dissecting the genetic basis of complex diseases and understanding their pathogenesis therefore hinges on successful processing of the DNA probe-level signal. Moreover, starting the exploration from the raw probe-level signal protects the original data from being compromised, which usually gives a sound basis for choosing appropriate algorithms or techniques for further analysis. In this paper, we present steps towards processing the probe-level signal from the microarray. As case studies of our approach, we then analyze two public data sets from scratch: one describes gene expression in synchronous and metachronous liver metastatic lesions from colorectal cancer; the other comprises biopsies from patients with EBV-positive undifferentiated nasopharyngeal carcinoma and from cancer-free controls. Compared with previous work, our approach not only identifies up/down-regulated genes but also yields insight into pathogenesis.

Keywords- Microarray, DNA probes, Signal intensity, Gene regulation, Quality assessment, Filtering, Multiple testing, Taxonomic clustering, Pathogenesis