Category: Subjects

ApplicationAnalysis7Recommendations.docx

Application Analysis 7: Recommendations. Instructions: We have been analyzing several conflicts over the course of this semester. This last application analysis requires that you choose one of those conflicts and provide recommendations for the parties involved. You only need to make one post, but it should be thorough (about 2 paragraphs). […]

WK1STUDENTREPLIES.docx

STUDENT REPLIES. STUDENT REPLY #1: Gina R Tracy. CRIME SCENE RATIONALE: Expertise in managing crime scenes is a crucial part of any investigation since the evidence gathered would paint a picture of what transpired for the judge and jury to evaluate (Brandl, 2018). To complete this image, we will use information gathered from the crime […]

Activity19ForgiveSomeone.docx

Activity 19: Forgive Someone. Instructions: Forgiving someone who has harmed you can be a difficult process, especially if you have been holding on to your hurt for a long time. Forgiveness is a gift that victims choose to give. Multiple studies have shown there are powerful benefits to forgiveness, such as […]

introduction-to-data-mining-2nbsped-2017048641-9780133128901-013312890311.pdf

INTRODUCTION TO DATA MINING, SECOND EDITION
PANG-NING TAN, Michigan State University; MICHAEL STEINBACH, University of Minnesota; ANUJ KARPATNE, University of Minnesota; VIPIN KUMAR, University of Minnesota
330 Hudson Street, NY NY 10013
Director, Portfolio Management: Engineering, Computer Science & Global Editions: Julian Partridge; Specialist, Higher Ed Portfolio Management: Matt Goldstein; Portfolio Management Assistant: Meghan Jacoby; Managing Content Producer: Scott Disanno; Content Producer: Carole Snyder; Web Developer: Steve Wright; Rights and Permissions Manager: Ben Ferrini; Manufacturing Buyer, Higher Ed, Lake Side Communications Inc (LSC): Maura Zaldivar-Garcia; Inventory Manager: Ann Lam; Product Marketing Manager: Yvonne Vannatta; Field Marketing Manager: Demetrius Hall; Marketing Assistant: Jon Bryant; Cover Designer: Joyce Wells, jWells Design; Full-Service Project Management: Chandrasekar Subramanian, SPi Global
Copyright © 2019 Pearson Education, Inc. All rights reserved. Manufactured in the United States of America. This publication is protected by Copyright, and permission should be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic, mechanical, photocopying, recording, or likewise. For information regarding permissions, request forms and the appropriate contacts within the Pearson Education Global Rights & Permissions department, please visit www.pearsonhighed.com/permissions/.
Many of the designations by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim, the designations have been printed in initial caps or all caps.
Library of Congress Cataloging-in-Publication Data on File
Names: Tan, Pang-Ning, author. | Steinbach, Michael, author. | Karpatne, Anuj, author. | Kumar, Vipin, 1956- author.
Title: Introduction to Data Mining / Pang-Ning Tan, Michigan State University, Michael Steinbach, University of Minnesota, Anuj Karpatne, University of Minnesota, Vipin Kumar, University of Minnesota.
Description: Second edition. | New York, NY: Pearson Education, [2019] | Includes bibliographical references and index.
Identifiers: LCCN 2017048641 | ISBN 9780133128901 | ISBN 0133128903
Subjects: LCSH: Data mining.
Classification: LCC QA76.9.D343 T35 2019 | DDC 006.3/12–dc23
LC record available at https://lccn.loc.gov/2017048641
118
ISBN-10: 0133128903
ISBN-13: 9780133128901
To our families…
Preface to the Second Edition
Since the first edition, roughly 12 years ago, much has changed in the field of data analysis. The volume and variety of data being collected continues to increase, as has the rate (velocity) at which it is being collected and used to make decisions. Indeed, the term, Big Data, has been used to refer to the massive and diverse data sets now available. In addition, the term data science has been coined to describe an emerging area that applies tools and techniques from various fields, such as data mining, machine learning, statistics, and many others, to extract actionable insights from data, often big data.
The growth in data has created numerous opportunities for all areas of data analysis. The most dramatic developments have been in the area of predictive modeling, across a wide range of application domains. For instance, recent advances in neural networks, known as deep learning, have shown impressive results in a number of challenging areas, such as image classification, speech recognition, as well as text categorization and understanding. While not as dramatic, other areas, e.g., clustering, association analysis, and anomaly detection have also continued to advance. This new edition is in response to those advances.
Overview
As with the first edition, the second edition of the book provides a comprehensive introduction to data mining and is designed to be accessible and useful to students, instructors, researchers, and professionals. Areas covered include data preprocessing, predictive modeling, association analysis, cluster analysis, anomaly detection, and avoiding false discoveries. The goal is to present fundamental concepts and algorithms for each topic, thus providing the reader with the necessary background for the application of data mining to real problems. As before, classification, association analysis, and cluster analysis are each covered in a pair of chapters. The introductory chapter covers basic concepts, representative algorithms, and evaluation techniques, while the following chapter discusses advanced concepts and algorithms. As before, our objective is to provide the reader with a sound understanding of the foundations of data mining, while still covering many important advanced topics. Because of this approach, the book is useful both as a learning tool and as a reference.
To help readers better understand the concepts that have been presented, we provide an extensive set of examples, figures, and exercises. The solutions to the original exercises, which are already circulating on the web, will be made public. The exercises are mostly unchanged from the last edition, with the exception of new exercises in the chapter on avoiding false discoveries. New exercises for the other chapters and their solutions will be available to instructors via the web. Bibliographic notes are included at the end of each chapter for readers who are interested in more advanced topics, historically important papers, and recent trends. These have also been significantly updated. The book also contains a comprehensive subject and author index.
What is New in the Second Edition?
Some of the most significant improvements in the text have been in the two chapters on classification. The introductory chapter uses the decision tree classifier for illustration, but the discussion on many topics (those that apply across all classification approaches) has been greatly expanded and clarified, including topics such as overfitting, underfitting, the impact of training size, model complexity, model selection, and common pitfalls in model evaluation. Almost every section of the advanced classification chapter has been significantly updated. The material on Bayesian networks, support vector machines, and artificial neural networks has been significantly expanded. We have added a separate section on deep networks to address the current developments in this area. The discussion of evaluation, which occurs in the section on imbalanced classes, has also been updated and improved.
The changes in association analysis are more localized. We have completely reworked the section on the evaluation of association patterns (introductory chapter), as well as the sections on sequence and graph mining (advanced chapter). Changes to cluster analysis are also localized. The introductory chapter added the K-means initialization technique and an updated discussion of cluster evaluation. The advanced clustering chapter adds a new section on spectral graph clustering. Anomaly detection has been greatly revised and expanded. Existing approaches (statistical, nearest neighbor/density-based, and clustering based) have been retained and updated, while new approaches have been added: reconstruction-based, one-class classification, and information-theoretic. The reconstruction-based approach is illustrated using autoencoder networks that are part of the deep learning paradigm. The data chapter has been updated to include discussions of mutual information and kernel-based techniques.
The last chapter, which discusses how to avoid false discoveries and produce valid results, is completely new, and is novel among other contemporary textbooks on data mining. It supplements the discussions in the other chapters with a discussion of the statistical concepts (statistical significance, p-values, false discovery rate, permutation testing, etc.) relevant to avoiding spurious results, and then illustrates these concepts in the context of data mining techniques. This chapter addresses the increasing concern over the validity and reproducibility of results obtained from data analysis. The addition of this last chapter is a recognition of the importance of this topic and an acknowledgment that a deeper understanding of this area is needed for those analyzing data.
The data exploration chapter has been deleted, as have the appendices, from the print edition of the book, but will remain available on the web. A new appendix provides a brief discussion of scalability in the context of big data.
To the Instructor
As a textbook, this book is suitable for a wide range of students at the advanced undergraduate or graduate level. Since students come to this subject with diverse backgrounds that may not include extensive knowledge of statistics or databases, our book requires minimal prerequisites. No database knowledge is needed, and we assume only a modest background in statistics or mathematics, although such a background will make for easier going in some sections. As before, the book, and more specifically, the chapters covering major data mining topics, are designed to be as self-contained as possible. Thus, the order in which topics can be covered is quite flexible. The core material is covered in chapters 2 (data), 3 (classification), 5 (association analysis), 7 (clustering), and 9 (anomaly detection). We recommend at least a cursory coverage of Chapter 10 (Avoiding False Discoveries) to instill in students some caution when interpreting the results of their data analysis. Although the introductory data chapter (2) should be covered first, the basic classification (3), association analysis (5), and clustering chapters (7), can be covered in any order. Because of the relationship of anomaly detection (9) to classification (3) and clustering (7), these chapters should precede Chapter 9.
Various topics can be selected from the advanced classification, association analysis, and clustering chapters (4, 6, and 8, respectively) to fit the schedule and interests of the instructor and students. We also advise that the lectures be augmented by projects or practical exercises in data mining. Although they are time consuming, such hands-on assignments greatly enhance the value of the course.
Support Materials
Support materials available to all readers of this book are available at http://www-users.cs.umn.edu/~kumar/dmbook.
PowerPoint lecture slides
Suggestions for student projects
Data mining resources, such as algorithms and data sets
Online tutorials that give step-by-step examples for selected data mining techniques described in the book using actual data sets and data analysis software
Additional support materials, including solutions to exercises, are available only to instructors adopting this textbook for classroom use. The book’s resources will be mirrored at www.pearsonhighered.com/cs-resources. Comments and suggestions, as well as reports of errors, can be sent to the authors through dmbook@cs.umn.edu.
Acknowledgments
Many people contributed to the first and second editions of the book. We begin by acknowledging our families to whom this book is dedicated. Without their patience and support, this project would have been impossible.
We would like to thank the current and former students of our data mining groups at the University of Minnesota and Michigan State for their contributions. Eui-Hong (Sam) Han and Mahesh Joshi helped with the initial data mining classes. Some of the exercises and presentation slides that they created can be found in the book and its accompanying slides. Students in our data mining groups who provided comments on drafts of the book or who contributed in other ways include Shyam Boriah, Haibin Cheng, Varun Chandola, Eric Eilertson, Levent Ertöz, Jing Gao, Rohit Gupta, Sridhar Iyer, Jung-Eun Lee, Benjamin Mayer, Aysel Ozgur, Uygar Oztekin, Gaurav Pandey, Kashif Riaz, Jerry Scripps, Gyorgy Simon, Hui Xiong, Jieping Ye, and Pusheng Zhang. We would also like to thank the students of our data mining classes at the University of Minnesota and Michigan State University who worked with early drafts of the book and provided invaluable feedback. We specifically note the helpful suggestions of Bernardo Craemer, Arifin Ruslim, Jamshid Vayghan, and Yu Wei.
Joydeep Ghosh (University of Texas) and Sanjay Ranka (University of Florida) class tested early versions of the book. We also received many useful suggestions directly from the following UT students: Pankaj Adhikari, Rajiv Bhatia, Frederic Bosche, Arindam Chakraborty, Meghana Deodhar, Chris Everson, David Gardner, Saad Godil, Todd Hay, Clint Jones, Ajay Joshi, Joonsoo Lee, Yue Luo, Anuj Nanavati, Tyler Olsen, Sunyoung Park, Aashish Phansalkar, Geoff Prewett, Michael Ryoo, Daryl Shannon, and Mei Yang. […]

Module12Discussions.docx

Module 12 Discussions (worth 1 point). Discuss any or all of the following: TedTalks: and and/or · You must post one original thread in order to enter the forum. Write: (1) what you already knew, which was reinforced in this module’s presentations, (2) what you learned for the first time during […]

Data Mining

How do data and the classification of data impact data mining? What is association in data mining? Select a specific association rule (from the text) and thoroughly explain the key concepts. Discuss cluster analysis concepts. Explain what an anomaly is and how to avoid it. Discuss methods to avoid false discoveries.

assignment.docx

Instructions: A SWOT analysis is used as a strategic planning technique by businesses and/or individuals to identify strengths, weaknesses, opportunities, and threats to a planned project. It identifies conditions that are favorable or unfavorable to achieving the goal of the project by grouping them into categories: Strengths and Weaknesses are frequently internally related, while Opportunities and […]

AssignmentD.docx

From the perspective of a death scene investigator: 1. Write a descriptive narrative of the decedent and the scene (this should paint the picture for someone who was not on scene and has not seen the photos). 2. Identify 3 items of potential evidence, other than the body itself, and explain how each item may […]
