TY - JOUR
T1 - Knowledge discovery from soil maps using inductive learning
AU - Qi, Feng
AU - Zhu, A. Xing
PY - 2003/12
Y1 - 2003/12
N2 - This paper develops a knowledge discovery procedure for extracting knowledge of soil-landscape models from a soil map. It has broad relevance to knowledge discovery from other natural resource maps. The procedure consists of four major steps: data preparation, data preprocessing, pattern extraction, and knowledge consolidation. In order to recover true expert knowledge from the error-prone soil maps, our study pays specific attention to the reduction of representation noise in soil maps. The data preprocessing step has exhibited an important role in obtaining greater accuracy. A specific method for sampling pixels based on modes of environmental histograms has proven to be effective in terms of reducing noise and constructing representative sample sets. Three inductive learning algorithms, the See5 decision tree algorithm, Naïve Bayes, and artificial neural network, are investigated for a comparison concerning learning accuracy and result comprehensibility. See5 proves to be an accurate method and produces the most comprehensible results, which are consistent with the rules (expert knowledge) used in producing the soil map. The incorporation of spatial information into the knowledge discovery process is found not only to improve the accuracy of the extracted knowledge, but also to add to the explicitness and extensiveness of the extracted soil-landscape model.
UR - http://www.scopus.com/inward/record.url?scp=0242485179&partnerID=8YFLogxK
U2 - 10.1080/13658810310001596049
DO - 10.1080/13658810310001596049
M3 - Article
AN - SCOPUS:0242485179
SN - 1365-8816
VL - 17
SP - 771
EP - 795
JO - International Journal of Geographical Information Science
JF - International Journal of Geographical Information Science
IS - 8
ER -