September 1, 2009
Center for Computational Engineering and Networking (CEN), Coimbatore
What do Computational Linguistics, Computational Drug Discovery, Bioinformatics, Chemoinformatics and Computational Business Intelligence all have in common? For one, they are potential application areas of the principles explained in Amrita’s book, Insights into Data Mining: From Theory to Practice. Published by Prentice Hall in India in 2006, the book was recently printed in Chinese as well.
“Many Indian universities already use this as a textbook,” stated Dr. K. P. Soman, Director of CEN and one of the three co-authors of the book. “Now a Chinese publisher has obtained the copyright and has published copies of the book for use in universities across China. This book was one of seven books selected from India for use as a textbook in China.”
In India, the book is currently in its fourth print edition. Why is the book so popular? Dr. Soman explains, “Today, in modern science and engineering, there is a paradigm shift. Instead of classical modeling and analyses based on first principles, now the approach is to develop models and do the corresponding analyses directly from data. The growing use of computers has enabled a large amount of data to be readily available. This data is analyzed to derive useful models for predicting system relationships.”
Students of science and engineering are familiar with Newton’s laws of motion, Maxwell’s equations in electromagnetism, and many other such basic scientific models. Here, first-principle models are used to describe physical, biological and social systems. Experimental data is used only to verify the underlying first principles. Sometimes, it may also be used to estimate parameters that are difficult or impossible to measure directly. Traditional science and engineering are based on this approach.
Today, however, the systems under study have become increasingly complex, and the underlying first principles in these domains are not readily known. This is where data mining comes in. These systems, being computer-based, generate large amounts of data that can be easily collected. In the absence of first-principle models, this data is analyzed and mined to arrive at models. Relationships between system variables are predicted and tested, and previously unknown input-output dependencies are uncovered.
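To make the idea of deriving a model directly from data concrete, here is a minimal sketch in Python. It is only an illustration and is not taken from the book: the function names `fit_line` and `predict` are hypothetical, the observations are made up, and a straight-line least-squares fit stands in for the far richer methods that data mining offers.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of the inputs and their co-variation with the outputs
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Use the fitted model to estimate the output for a new input."""
    slope, intercept = model
    return slope * x + intercept


# Made-up observations of a system whose governing law is assumed unknown
inputs = [1.0, 2.0, 3.0, 4.0, 5.0]
outputs = [3.1, 4.9, 7.2, 8.8, 11.0]

model = fit_line(inputs, outputs)
print("slope, intercept:", model)
print("predicted output at x = 6:", predict(model, 6.0))
```

No physical law is assumed here; the relationship between input and output is estimated entirely from the recorded observations and then used to predict an unseen case.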
It is no wonder then that data mining today finds many applications in science, engineering, commerce and industry. Amrita’s book Insights into Data Mining: From Theory to Practice covers the most widely used methods in data mining. Simple examples and many illustrations help liven up the text. Now students in China will also learn basic principles from this text. Amrita is proud to thus contribute to the study of science and engineering by students beyond India’s borders.