Topic Classification and Predictive Modeling of Oceanographic Data using Data Mining Techniques
Abstract: Oceanographic data from instruments such as the Acoustic Doppler Current Profiler (ADCP) used in this study are high-dimensional and demanding to process and interpret. Data mining techniques have proven useful for analyzing such datasets more easily and in greater depth. This study uses both unsupervised and supervised machine learning to analyze and model data collected from an ADCP. Principal component analysis (PCA) was applied to reduce the dimensionality of the data for visualization and cluster analysis. The main features common to both datasets included physical and chemical properties, such as temperature, location, and error velocity. The similarity of the features driving cluster formation indicates that PCA consistently identified the most important features in the data. Support vector machines (SVMs) with various kernel functions and constant values classified the data with high accuracy by the transect from which each record was collected; the dot product and polynomial kernel functions achieved the highest classification accuracy overall. These machine learning techniques were successful in analyzing ADCP data and may be applied to similar oceanographic datasets in future studies.
Faculty Advisor: Sam Abuomar, Computing Sciences
Graduate Student Mentor: Ike Vayansky, Computing Sciences
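
As an illustration of the workflow the abstract describes, the following is a minimal sketch, not the study's actual pipeline. It assumes Python with NumPy and scikit-learn, and it uses synthetic data in place of ADCP records; the three "transects", the 12 features, and all variable names are hypothetical. The linear kernel stands in for the dot product kernel named in the abstract, since both compute the plain inner product of two samples.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for ADCP records (NOT the study's data): three
# hypothetical transects, each a Gaussian blob in a 12-feature space.
rng = np.random.default_rng(0)
n_per_transect, n_features = 200, 12
centers = rng.normal(scale=4.0, size=(3, n_features))
X = np.vstack([c + rng.normal(size=(n_per_transect, n_features)) for c in centers])
y = np.repeat(np.arange(3), n_per_transect)  # transect label per record

# Unsupervised step: standardize, then project onto two principal components
# so the records can be plotted and clustered in 2-D.
X_scaled = StandardScaler().fit_transform(X)
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_scaled)

# Loadings indicate which original features contribute most to a component.
top_features_pc1 = np.argsort(np.abs(pca.components_[0]))[::-1][:3]

# Supervised step: compare a linear ("dot product") and a polynomial kernel
# on a held-out test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
accuracy = {}
for kernel in ("linear", "poly"):
    model = make_pipeline(StandardScaler(), SVC(kernel=kernel, C=1.0))
    model.fit(X_train, y_train)
    accuracy[kernel] = model.score(X_test, y_test)

print("explained variance ratio:", pca.explained_variance_ratio_)
print("indices of top PC1 features:", top_features_pc1)
print("test accuracy by kernel:", accuracy)
```

With real ADCP data, `X` would hold the measured variables (for example temperature, location, and error velocity) and `y` the transect identifier; the constant `C` and the kernel choice are the settings one would vary to compare classifiers.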