Tuesday, June 4, 2019
Data warehouse and data mining
Abstract

Data mining and data warehousing are among the important issues in the corporate world today. The biggest challenge in a world that is full of data is searching through it to find connections and information that were not previously known. Dramatic advances in data development make the role of data mining and the data warehouse important in improving business operations in organizations. The importance of data mining and the data warehouse in organizations is seen in the process of accumulating and integrating vast and growing amounts of data in various formats and various databases. This paper discusses the data warehouse and data mining, the concept of data mining and the data warehouse, the tools and techniques of data mining, and the benefits of data mining and the data warehouse to organizations.

Keywords

Data, Data warehouse, Data Mining, Data Mart

Introduction

Organizations tend to grow and prosper as they gain a better understanding of their environment. Typically, business managers must be able to track daily transactions to evaluate how the business is performing. By tapping into the operational database, management can develop strategies to meet organizational goals. The processes that identify trends and patterns in the data are what make this possible. How operational data is handled in an organization is therefore important, because the reason for generating, storing and managing data is to create information that becomes the basis for rational decision making. To facilitate the decision-making process, decision support systems (DSSs) were developed. A DSS is an arrangement of computerized tools used to assist managerial decision making within a business. Decision support is a methodology designed to extract information from data and to use such information as a basis for decision making.
However, information requirements have become so complex that it is difficult for a DSS to extract all necessary information from the data structures typically found in an operational database. Therefore, data mining and the data warehouse were developed and became a proactive methodology for supporting managerial decision making in organizations.

Introduction to the Data Warehouse

A data warehouse is a firm's repository that updates and stores the historical business data of the organization and then transforms the data into a multidimensional data model for efficient querying and analysis. The stored data are extracted from multiple operational systems in the organization and contain information on relevant activity that occurred in the past, in order to support organizational decision making. A data mart, on the other hand, is a subset of a data warehouse. It holds some special information that has been grouped to help the business make better decisions. The data used here are usually derived from the data warehouse. The first organized use of such large databases started with OLAP (Online Analytical Processing), whose focus is the analytical processing of the organization's data. The difference between a data mart and a data warehouse is only the size and scope of the problem being solved.

According to William H. Inmon (2005), a data warehouse is a subject-oriented, integrated, time-variant, and non-volatile collection of data in support of management's decision-making process. To understand that definition, its components are explained in more detail:

Integrated
Provide a unified view of all data elements with a common definition and representation for all business units.

Subject-oriented
Data are stored with a subject orientation that facilitates multiple views of the data and facilitates decision making.
For example, sales may be recorded by product, by division, by manager, or by region.

Time-variant
Data are recorded with a historical perspective in mind. Therefore, a time dimension is added to facilitate data analysis and various time comparisons.

Nonvolatile
Data cannot be changed. Data are added only periodically from historical systems. Once the data are properly stored, no changes are allowed. Therefore, the data environment is relatively static.

In summary, the data warehouse is usually a read-only database optimized for data analysis and query processing. Typically, data are extracted from various sources and are then transformed and integrated (in other words, passed through a data filter) before being loaded into the data warehouse. Users access the data warehouse via front-end tools and end-user application software to extract the data in usable form.

The Issues That Arise in Data Warehouse

Although the centralized and integrated data warehouse can be a very attractive proposition that yields many benefits, managers may be reluctant to embrace this strategy. Creating a data warehouse requires time, money, and considerable managerial effort. Therefore, it is not surprising that many companies begin their foray into data warehousing by focusing on more manageable data sets that are targeted to meet the special needs of small groups within the organization. These smaller data warehouses are called data marts. A data mart is a small, single-subject data warehouse subset that provides decision support to a small group of people.

Some organizations choose to implement data marts not only because of the lower cost and shorter implementation time, but also because of the current technological advances and inevitable people issues that make data marts attractive. Powerful computers can provide a customized DSS to small groups in ways that might not be possible with a centralized system.
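As a rough illustration of the extract, transform and load flow and of a data mart as a subset of the warehouse, the sketch below uses Python's built-in sqlite3 module. The source rows, the table name sales_fact, the view name north_mart and the cleaning rules are all assumptions made up for this example; they are not taken from any real system.

```python
import sqlite3

# Hypothetical rows from two operational sources (illustrative data only):
# (sale_date, region, product, amount as text)
store_sales = [("2019-01-05", "north", "widget", "12.50"),
               ("2019-01-06", "south", "gadget", "7.00")]
web_sales = [("2019-01-06", "NORTH ", "widget", "12.50"),
             ("2019-02-01", "south", "widget", None)]


def transform(rows, channel):
    """Drop incomplete rows and give every field a common representation."""
    for sale_date, region, product, amount in rows:
        if amount is None:  # the "data filter": skip impure records
            continue
        yield (sale_date, region.strip().lower(), product, float(amount), channel)


def load_warehouse(conn):
    """Load both sources into one integrated, time-stamped fact table."""
    conn.execute(
        "CREATE TABLE sales_fact "
        "(sale_date TEXT, region TEXT, product TEXT, amount REAL, channel TEXT)"
    )
    for rows, channel in ((store_sales, "store"), (web_sales, "web")):
        conn.executemany(
            "INSERT INTO sales_fact VALUES (?, ?, ?, ?, ?)",
            transform(rows, channel),
        )
    # A data mart sketched as a single-subject subset: only the north region.
    conn.execute(
        "CREATE VIEW north_mart AS SELECT * FROM sales_fact WHERE region = 'north'"
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
load_warehouse(conn)

# Analytical, read-only queries against the warehouse and the mart.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales_fact GROUP BY region ORDER BY region"
).fetchall()
mart_rows = conn.execute("SELECT COUNT(*) FROM north_mart").fetchone()[0]
print(totals, mart_rows)
```

Here the north_mart view only stands in for a data mart; in practice a data mart may be stored separately rather than defined as a view.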
A company's culture may also predispose its employees to resist major changes, but they might quickly embrace relatively minor changes that lead to demonstrably improved decision support. In addition, people at different organizational levels are likely to require data with different summarization, aggregation, and presentation formats. Data marts can serve as a test vehicle for companies exploring the potential benefits of data warehouses. By migrating gradually from data marts to data warehouses, a specific department's decision support needs can be addressed within a reasonable time frame (six months to one year), as compared to the longer time frame usually required to implement a data warehouse (one to three years). Information Technology (IT) departments also benefit from this approach because their personnel have the opportunity to learn the issues and develop the skills required to create a data warehouse.

Concept of Data Mining

Data mining refers to the forecasting techniques and analytical tools that are used extensively in industry and by corporations to make decision making more effective. Data mining tools analyze the data, uncover problems or opportunities hidden in the data relationships, form computer models based on their findings, and then use the models to predict business behavior, requiring minimal end-user intervention. Data mining works by searching for valuable information in a huge amount of data collected over time and defining the patterns or relationships that the data present. In business, organizations use data mining to predict customer behavior in the business environment. The process of data mining starts by analyzing the data from different perspectives and summarizing it into useful information, from which knowledge is then created to address any kind of business problem.
For example, banks and credit card companies use knowledge-based analysis to detect fraud, thereby decreasing fraudulent transactions. In fact, data mining has proved to be very helpful in finding practical relationships among data that help define customer buying patterns, improve product development and acceptance, reduce healthcare fraud, analyze stock markets and so on.

Data Mining in Historical Perspective

Over the last 25 years or so, there has been a gradual evolution from data processing to data mining. In the 1960s businesses routinely collected data and processed it using database management techniques that allowed an orderly listing and tabulation of the data as well as some query activity. OLTP (Online Transaction Processing) became routine, data retrieval from stored data became faster and more efficient because of the availability of new and better storage devices, and data processing became quicker and more efficient because of advances in computer technology. Database management advanced rapidly to include highly sophisticated query systems, and became popular not only in business applications but also in scientific inquiries.

Approaches of Data Mining in Various Industries

With data mining, a retail store may find that certain products are sold more in one channel of distribution than in the others, certain products are sold more in one geographical location than in others, and certain products are sold when a certain event occurs.
With data mining, a financial analyst would like to know the characteristics of a successful prospective employee; credit card departments would like to know which potential customers are more likely to pay back the debt and, when a credit card is swiped, which transaction is fraudulent and which one is legitimate; direct marketers would like to know which customers purchase which types of products; booksellers like Amazon would like to know which customers purchase which types of books (fiction, detective stories or any other kind); and so on. With this type of information available, decision makers will make better choices. Human resource people will hire the right individuals. Credit departments will target those prospective customers that are less prone to become delinquent or less likely to be involved in fraudulent activities. Direct marketers will target those customers that are likely to purchase their products. With the insight gained from data mining, businesses may wish to re-configure their product offering and emphasize specific features of a product.

These are not the only uses of data mining. Police use this tool to determine when and where a crime is likely to occur, and what the nature of that crime would be. Organized stock exchanges detect fraudulent activities with data mining. Pharmaceutical companies mine data to predict the efficacy of compounds as well as to uncover new chemical entities that may be useful for a particular disease. The airline industry uses it to predict which flights are likely to be delayed (well before the flight is scheduled to depart). Weather analysts determine weather patterns with data mining to predict when there will be rain, sunshine, a hurricane, or snow. Besides that, nonprofit organizations use data mining to predict the likelihood of individuals making a donation for a certain cause.
The uses of data mining are far reaching and its benefits may be quite significant.

Data Mining Tools and Techniques

Data mining is a set of tools that learn from the data obtained and then use the useful information for business forecasting. Data mining tools use and analyze the data that exist in databases, data marts, and data warehouses. Data mining tools can be grouped into four categories: prediction tools, classification tools, clustering analysis tools and association rules discovery. Each category is elaborated here.

Prediction Tools
A prediction tool is a method derived from traditional statistical forecasting for predicting the value of a variable.

Classification Tools
Classification tools attempt to distinguish the differences between classes of objects or actions. For example, an advertiser may want to know which aspect of its promotion is most appealing to consumers. Is it the price, quality or reliability of a product? Or maybe it is a special feature that is missing from competing products. These tools help provide such information on all the products, making it possible to use the advertising budget in the most effective manner.

Clustering Analysis Tools
These are very powerful tools for clustering products into groups that naturally fall together; the groups are identified by the program. Most of the clusters discovered may not be useful in business decisions. However, one or two may turn out to be extremely important, and those are the ones the company can take advantage of. The most common use is market segmentation, a process in which a company divides its customer base into segments based on characteristics like income, wealth and so on. Each segment is then treated with a different marketing approach.

Association Rules Discovery
This tool discovers associations, such as what kinds of books certain groups of people read, what products certain groups of people purchase and so on.
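To make association rules discovery a little more concrete, here is a minimal sketch in Python that scores rules between pairs of items using the two usual measures, support (the share of baskets containing both items) and confidence (the share of baskets with the first item that also contain the second). The baskets, the thresholds and the function name pair_rules are assumptions chosen for illustration; real tools use more scalable algorithms and handle larger item sets.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market baskets (illustrative data only).
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]


def pair_rules(baskets, min_support=0.4, min_confidence=0.7):
    """Return (antecedent, consequent, support, confidence) for item pairs."""
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        frozenset(pair)
        for basket in baskets
        for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for pair, count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # the pair is too rare to be interesting
        for antecedent in pair:
            (consequent,) = pair - {antecedent}
            confidence = count / item_counts[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return sorted(rules)


rules = pair_rules(baskets)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: "
          f"support={support:.2f}, confidence={confidence:.2f}")
```

With this toy data, the rule "milk -> butter" has a confidence of 1.0 because every basket containing milk also contains butter, while "butter -> milk" is dropped because only half of the butter baskets contain milk.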
Businesses use such information in targeting their markets. For instance, a service can recommend movies based on movies people have watched and rated in the past.

There are four general phases in data mining: data preparation, data analysis and classification, knowledge acquisition, and prognosis.

Data Preparation
In the data preparation phase, the main data sets to be used by the data mining operation are identified and cleaned of any data impurities. Because the data in the data warehouse are already integrated and filtered, the data warehouse usually is the target set for data mining operations.

Data Analysis
The data analysis and classification phase studies the data to identify common data characteristics or patterns. During this phase, the data mining tool applies specific algorithms to find:
Data groupings, classifications, clusters, or sequences.
Data dependencies, links, or relationships.
Data patterns, trends, and deviations.

Knowledge Acquisition
The knowledge-acquisition phase uses the results of the data analysis and classification phase. During the knowledge-acquisition phase, the data mining tool (with possible intervention by the end user) selects the appropriate modeling or knowledge-acquisition algorithms. The most common algorithms used in data mining are based on neural networks, decision trees, rules induction, genetic algorithms, classification and regression trees, memory-based reasoning, and nearest neighbor and data visualization. A data mining tool may use many of these algorithms in any combination to generate a computer model that reflects the behavior of the target data set.

Prognosis
Although many data mining tools stop at the knowledge-acquisition phase, others continue to the prognosis phase. In that phase, the data mining findings are used to predict future behavior and forecast business outcomes.
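The step from knowledge acquisition to prognosis can be sketched with one of the algorithm families named above, nearest neighbor. The example below is a minimal illustration, assuming a tiny made-up history of customers described by age and income (in thousands) and a hypothetical predict function; it is not how any particular data mining tool works, and a real model would at least scale the features first.

```python
import math

# Hypothetical historical customers: (age, income in thousands) -> outcome.
history = [
    ((25, 20.0), "cancelled"),
    ((30, 24.0), "cancelled"),
    ((45, 60.0), "kept"),
    ((52, 75.0), "kept"),
]


def predict(customer, history, k=3):
    """Label a new customer by majority vote among the k closest past customers."""
    nearest = sorted(history, key=lambda rec: math.dist(rec[0], customer))[:k]
    labels = [label for _, label in nearest]
    return max(set(labels), key=labels.count)


# Prognosis: apply what was learned from past data to customers not seen before.
young_low_income = predict((28, 22.0), history)
older_high_income = predict((50, 70.0), history)
print(young_low_income, older_high_income)
```

The history list plays the role of the cleaned target data set, and calling predict on a new customer corresponds to the prognosis phase.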
Examples of data mining findings can be:
65% of customers who did not use a particular credit card in the last six months are 88% likely to cancel that account.
82% of customers who bought a 27-inch or larger TV are 90% likely to buy an entertainment center within the next four weeks.
If age < 30 and income <= 25,000 and credit rating < 3 and credit amount > 25,000, then the minimum loan term is ten years.

The complete set of findings can be represented in a decision tree, a neural net, a forecasting model, or a visual presentation interface that is used to project future events or results. For example, the prognosis phase might project the likely outcome of a new product rollout or a new marketing promotion.

The Benefits and Weaknesses of Data Warehouse to Organization

The data warehouse is one of the powerful techniques applied in organizations to assist managerial decision making within a business. This methodology has become a vital asset in modern business enterprise. It is designed to extract information from data and to use such information as a basis for decision making. An organization benefits from a data warehouse because it is a central repository that stores historical information: even though the data come from different locations and various points in time, all the relevant data are assembled in one location and organized in an efficient manner. Indirectly, this is profitable for the company because it greatly reduces computing costs. One advantage of using a data warehouse is that it makes a large volume of information accessible for solving the problems that arise in a business organization. All the data from multiple sources held in the central repository can be analyzed so that the organization can come up with a choice of solutions.

However, there are also weaknesses that need to be considered.
The processes of a data warehouse take a long time, because before the data can be stored in the warehouse they need to be extracted, cleaned and loaded. Maintaining the data is one of the problems of a data warehouse because it is not easy to handle. Compatibility may be an issue when implementing a data warehouse in an organization, because the new system may not work with the systems already in use. Besides that, users who work with the system must be trained to use it, because a lack of proper training may cause problems. Furthermore, if the data warehouse can be accessed via the internet, security might be an issue. The biggest problem related to the data warehouse is the cost that must be taken into consideration, especially for maintenance. Any organization that is considering using a data warehouse must decide if the benefits outweigh the costs.

Conclusion

Successfully supporting managerial decision making depends significantly on the availability of integrated, high-quality information that is organized and presented in a timely and easily understood way. Data mining and the data warehouse have emerged to meet this need. The application of data mining and the data warehouse will be a crucial element in organizations, helping management run operations smoothly and, at the same time, accomplish business goals. This is because both of these techniques are the foundation of decision support systems. Today data mining and the data warehouse are important tools, and more companies will begin using them in the future.