TY - GEN
T1 - Data mining and analysis of large scale time series network data
AU - Morreale, Patricia
AU - Holtz, Steve
AU - Goncalves, Allan
PY - 2013
Y1 - 2013
N2 - Large amounts of data are readily available and collected daily by global networks worldwide. However, much of the real-time utility of this data is not realized, as data analysis tools for very large datasets, particularly time series data, are cumbersome. This research presents a comparative study of three data mining tools using a large-scale time series dataset from NOAA for analysis and mining. Meteorological data, gathered daily, if used at all, is useful for a very short period of time, both to help determine current weather conditions and to predict upcoming weather events. Current weather prediction methods can only guess at what the conditions will be in the near-term future, approximately one week at a time. The goal of this research project was to take large amounts of archival NOAA weather data and use appropriate data mining algorithms to identify patterns that could help predict future weather events. The results of this work identify the merits of the RapidMiner tool over Weka and Orange, and provide future direction for data mining on massive data sets gathered from global networks.
AB - Large amounts of data are readily available and collected daily by global networks worldwide. However, much of the real-time utility of this data is not realized, as data analysis tools for very large datasets, particularly time series data, are cumbersome. This research presents a comparative study of three data mining tools using a large-scale time series dataset from NOAA for analysis and mining. Meteorological data, gathered daily, if used at all, is useful for a very short period of time, both to help determine current weather conditions and to predict upcoming weather events. Current weather prediction methods can only guess at what the conditions will be in the near-term future, approximately one week at a time. The goal of this research project was to take large amounts of archival NOAA weather data and use appropriate data mining algorithms to identify patterns that could help predict future weather events. The results of this work identify the merits of the RapidMiner tool over Weka and Orange, and provide future direction for data mining on massive data sets gathered from global networks.
KW - data mining
KW - data organization
KW - large scale data
KW - Orange
KW - RapidMiner
KW - streaming time series data
KW - visualization
KW - Weka
UR - http://www.scopus.com/inward/record.url?scp=84881396926&partnerID=8YFLogxK
U2 - 10.1109/WAINA.2013.92
DO - 10.1109/WAINA.2013.92
M3 - Conference contribution
AN - SCOPUS:84881396926
SN - 9780769549521
T3 - Proceedings - 27th International Conference on Advanced Information Networking and Applications Workshops, WAINA 2013
SP - 39
EP - 43
BT - Proceedings - 27th International Conference on Advanced Information Networking and Applications Workshops, WAINA 2013
T2 - 27th International Conference on Advanced Information Networking and Applications Workshops, WAINA 2013
Y2 - 25 March 2013 through 28 March 2013
ER -