Data mining and analysis of large scale time series network data

Patricia Morreale, Steve Holtz, Allan Goncalves

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

6 Scopus citations

Abstract

Large amounts of data are readily available and collected daily by global networks. However, much of the real-time utility of this data is not realized, as data analysis tools for very large datasets, particularly time series data, are cumbersome. This research presents a comparative study of three data mining tools using a large scale time series dataset from NOAA for analysis and mining. Meteorological data, gathered daily, if used at all, is useful for a very short period of time, both to help determine current weather conditions and to predict upcoming weather events. Current weather prediction methods can only guess at what the conditions will be in the near-term future, approximately one week at a time. The goal of this research project was to take large amounts of archival NOAA weather data and use appropriate data mining algorithms to identify patterns that could help predict future weather events. The results of this work identify the merits of the RapidMiner tool over Weka and Orange, and provide future direction for data mining on massive data sets gathered from global networks.

Original language: English
Title of host publication: Proceedings - 27th International Conference on Advanced Information Networking and Applications Workshops, WAINA 2013
Pages: 39-43
Number of pages: 5
State: Published - 2013
Event: 27th International Conference on Advanced Information Networking and Applications Workshops, WAINA 2013 - Barcelona, Spain
Duration: 25 Mar 2013 → 28 Mar 2013

Publication series

Name: Proceedings - 27th International Conference on Advanced Information Networking and Applications Workshops, WAINA 2013

Conference

Conference: 27th International Conference on Advanced Information Networking and Applications Workshops, WAINA 2013
Country/Territory: Spain
City: Barcelona
Period: 25/03/13 → 28/03/13

Keywords

  • data mining
  • data organization
  • large scale data
  • Orange
  • RapidMiner
  • streaming time series data
  • visualization
  • Weka
