  • Abstract: Sequential pattern mining is an important data mining problem with broad applications. However, it is also a challenging problem, since the mining may have to generate or examine a combinatorially explosive number of intermediate subsequences. Recent studies have developed two major classes of sequential pattern mining methods: (1) a candidate generation-and-test approach, represented by (i) GSP, a horizontal format-based sequential pattern mining method, and (ii) SPADE, a vertical format-based method; and (2) a pattern-growth method, represented by PrefixSpan and its further extensions, such as gSpan for mining structured patterns. In this study, we give a systematic introduction to and presentation of the pattern-growth methodology and study its principles and extensions. We first introduce two interesting pattern-growth algorithms, FreeSpan and PrefixSpan, for efficient sequential pattern mining. Then we introduce gSpan for mining structured patterns using the same methodology. Their relative performance in large databases is presented and analyzed. Several extensions of these methods are also discussed in the paper, including mining multi-level, multi-dimensional patterns and mining constraint-based patterns.

  • Tags: data mining, sequential pattern mining, scalability, performance analysis
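
The pattern-growth idea named in the abstract above can be illustrated with a minimal PrefixSpan-style sketch. It is not the paper's implementation: it assumes each sequence is a plain list of single items (no itemsets), and the names `prefixspan`, `mine`, and `min_support` are chosen here for illustration only.

```python
from collections import defaultdict


def prefixspan(sequences, min_support):
    """Return {pattern (tuple): support} for all frequent sequential patterns.

    Assumption: every sequence is a list of single items, so a pattern is
    simply an ordered tuple of items that appears as a subsequence.
    """
    results = {}

    def mine(prefix, projected):
        # `projected` holds (sequence index, start offset) pairs: the suffix
        # of each sequence that remains after matching `prefix`.
        counts = defaultdict(int)
        for seq_idx, start in projected:
            # Count every item at most once per sequence.
            for item in set(sequences[seq_idx][start:]):
                counts[item] += 1

        for item, support in sorted(counts.items()):
            if support < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = support

            # Build the projected database for the grown pattern: keep the
            # part of each suffix after the first occurrence of `item`.
            new_projected = []
            for seq_idx, start in projected:
                seq = sequences[seq_idx]
                try:
                    pos = seq.index(item, start)
                except ValueError:
                    continue  # this suffix does not contain the item
                new_projected.append((seq_idx, pos + 1))
            mine(pattern, new_projected)

    mine((), [(i, 0) for i in range(len(sequences))])
    return results


if __name__ == "__main__":
    db = [list("abcd"), list("acd"), list("abd"), list("bd")]
    for pattern, support in sorted(prefixspan(db, min_support=3).items()):
        print(pattern, support)
```

Because each recursive call only scans the projected suffixes, no candidate sequences are generated and tested against the full database, which is the contrast with GSP and SPADE that the abstract draws.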
  • Abstract: Geological Prospecting and Mining in Tibet. DONDUINAMGYI. September 1, 1995 marked the 30th anniv...

  • Tags:
  • Abstract: Huainan Coal Mining Bureau, an especially large coal enterprise and a state key coal production base, is situated in the central-north part of Anhui Province. The area, well known as "the coal capital of East China", abounds in coal resources, and the proven coal reserve is estimated at up to 70 billion tons, with a complete range of varieties and superior quality. By the year 2010, the annual production capacity will reach 30 million tons. There is an excellent investment environment and convenient communication and transportation

  • Tags:
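  • Abstract: In daily life, people often repeat regular routes during certain periods. In this paper, a mining system is developed to discover the continuous route patterns of an individual's past trips. To account for the diversity of personal moving status, the mining system employs adaptive GPS data recording and five data filters to guarantee clean trip data. The mining system uses a client/server architecture to protect personal privacy and to reduce the computational load. The server conducts the main mining procedure, but with insufficient information to recover the real personal routes. To improve the scalability of sequential pattern mining, a novel pattern mining algorithm, continuous route pattern mining (CRPM), is proposed. This algorithm can tolerate the various disturbances in real routes and extract the frequent patterns. Experimental results based on nine persons' trips show that CRPM can extract route patterns more than two times longer than those found by traditional route pattern mining algorithms.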

  • Tags: data acquisition, route pattern, Global Positioning System (GPS), mobile phone
  • Abstract: With massive amounts of data stored in databases, mining information and knowledge in databases has become an important issue in recent research. Researchers in many different fields have shown great interest in data mining and knowledge discovery in databases. Several emerging applications in information-providing services, such as data warehousing and on-line services over the Internet, also call for various data mining and knowledge discovery techniques to understand user behavior better, to improve the service provided, and to increase business opportunities. In response to such a demand, this article provides a comprehensive survey of recently developed data mining and knowledge discovery techniques and introduces some real application systems as well. In conclusion, this article also lists some problems and challenges for further research.

  • Tags: databases, knowledge discovery, machine learning, data mining
  • Abstract: The data used in the process of knowledge discovery often include noise and incomplete information. The boundaries between different classes of these data are blurred and unclear. When these data are clustered or classified, we often get coverings instead of partitions, which usually makes our information system insecure. In this paper, optimal partitioning of incomplete data is studied. First, the relationship between set cover and set partition is discussed, and the distance between a set cover and a set partition is defined. Second, the optimal partitioning of a given cover is studied by the combing and parting method, and acquiring the optimal partition from three different families of partition sets is discussed. Finally, the corresponding optimal algorithm is given. Real wireless signals often contain a lot of noise, and there are many errors at the boundaries when these data are clustered by the traditional method. In our experiment, the proposed method greatly improves the correct rate, and the experimental results demonstrate the method's validity.

  • Tags: data mining, information system, knowledge discovery, data clustering, set cover, optimal algorithm
  • Abstract: Land resources are facing the crisis of being misused, especially in the intersection area between town and country, and land control has to be enforced. This paper presents the development of a data mining method for land control. A vector-match method is proposed for data cleaning, the prerequisite of data mining; it deals with both character and numeric data by vectorizing character strings and matching numbers. A minimal decision algorithm from rough set theory is used to discover the knowledge hidden in the data warehouse. In order to monitor land use dynamically and accurately, it is suggested that a real-time land control system be set up based on GPS, digital photogrammetry and online data mining. Finally, the method is applied to the intersection area between town and country of Wuhan city, and a set of knowledge about land control is discovered.

  • Tags: land control, data mining, vector-match method
  • Abstract: Over the next decade, the government plans to pump in more than $1 billion to develop deep-sea technologies to reap copper, cobalt, manganese and rare earth minerals.

  • Tags: India, minerals
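  • Abstract: Mining causality is essential for providing a diagnosis. This research aims at extracting the causality that exists within multiple sentences or EDUs (elementary discourse units). The research emphasizes the use of causality verbs because they make explicit, in a certain way, the consequent events of a cause, e.g., "Aphids suck the sap from rice leaves. Then the leaves will shrink. Later, they will become yellow and dry." A verb can also be the causal-verb link between cause and effect within an EDU, e.g., "Aphids suck the sap from rice leaves, causing the leaves to be shrunk" ("causing" is equivalent to a causal-verb link in Thai). The research confronts two main problems: identifying the interesting causality events in documents, and identifying their boundaries. We then propose mining on verbs by using two different machine learning techniques, Naive Bayes and Support Vector Machine. The resulting mining rules will be used for the identification and extraction of causality across multiple EDUs in text. Our multiple-EDU extraction shows 0.88 precision with 0.75 recall from Naive Bayes, and 0.89 precision with 0.76 recall from Support Vector Machine.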

  • Tags: explanatory knowledge, causality, boundary value, computer
  • Abstract: This paper presents a fault-detection method based on phase space reconstruction and data mining approaches for complex electronic systems. The approach to the phase space reconstruction of chaotic time series is a combined algorithm of multiple autocorrelation and the Γ-test, by which the quasi-optimal embedding dimension and time delay can be obtained. The data mining algorithm, which calculates the radius of gyration of unit-mass points around the centre of mass in the phase space, can distinguish the fault parameter from the chaotic time series output by the tested system. The experimental results show that this fault detection method can correctly detect the fault phenomena of electronic systems.

  • Tags: data acquisition, fault detection, chaotic time series, phase space reconstruction, topological structure
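
The two steps described in the abstract above, reconstructing a phase space and measuring the radius of gyration of unit-mass points around their centre of mass, can be sketched as follows. This is only an illustration under stated assumptions: the embedding dimension `dim` and time delay `delay` are taken as given inputs (the abstract obtains them with multiple autocorrelation and the Γ-test, which is not reproduced here), and the function names are hypothetical.

```python
import math


def delay_embed(series, dim, delay):
    """Time-delay embedding of a scalar series.

    Point i is (x[i], x[i + delay], ..., x[i + (dim - 1) * delay]).
    """
    n_points = len(series) - (dim - 1) * delay
    if n_points <= 0:
        raise ValueError("series is too short for this dim/delay")
    return [
        tuple(series[i + k * delay] for k in range(dim))
        for i in range(n_points)
    ]


def radius_of_gyration(points):
    """Root-mean-square distance of unit-mass points from their centre of mass."""
    n = len(points)
    dim = len(points[0])
    centre = [sum(p[k] for p in points) / n for k in range(dim)]
    mean_sq = sum(
        sum((p[k] - centre[k]) ** 2 for k in range(dim)) for p in points
    ) / n
    return math.sqrt(mean_sq)


if __name__ == "__main__":
    # Hypothetical signal, used only to show the call sequence.
    signal = [math.sin(0.3 * t) for t in range(200)]
    cloud = delay_embed(signal, dim=3, delay=5)
    print(radius_of_gyration(cloud))
```

A detector built on this would compare the radius computed from the tested system's output with that of a known-good reference; how that comparison is thresholded is not stated in the abstract and is left open here.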
  • Abstract: Outlier mining is an important aspect of data mining, and outlier mining based on the Cook distance is the most commonly used approach. However, when the data exhibit multicollinearity, the traditional Cook method is no longer effective. Considering the advantages of principal component estimation, we use it in place of least squares estimation and then give a Cook distance measure based on principal component estimation, which can be used in outlier mining. We have also done some research on the related theory and application problems.

  • Tags: outlier mining, principal component estimation, Cook distance, data mining, linear regression model
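
For reference, the traditional least-squares Cook distance that the abstract above takes as its starting point can be sketched for simple linear regression. This is the classical baseline, not the principal-component variant the paper proposes, and the function name `cooks_distance` is an assumption made for this example. It uses D_i = e_i^2 / (p * s^2) * h_ii / (1 - h_ii)^2 with p = 2 parameters.

```python
def cooks_distance(x, y):
    """Classical Cook distance for each point of a simple linear regression.

    Assumes more than two points, at least two distinct x values, and a
    fit that is not exact (otherwise the residual variance s^2 is zero).
    """
    n = len(x)
    p = 2  # fitted parameters: intercept and slope

    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)

    # Ordinary least squares fit.
    slope = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sxx
    intercept = y_mean - slope * x_mean

    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in residuals) / (n - p)  # residual variance

    # Leverage h_ii of each point in simple linear regression.
    leverages = [1 / n + (xi - x_mean) ** 2 / sxx for xi in x]

    return [
        e * e / (p * s2) * h / (1 - h) ** 2
        for e, h in zip(residuals, leverages)
    ]


if __name__ == "__main__":
    xs = [1, 2, 3, 4, 5]
    ys = [1, 2, 3, 4, 15]  # the last point is a deliberate outlier
    distances = cooks_distance(xs, ys)
    print(distances.index(max(distances)), max(distances))
```

With several correlated predictors the least-squares fit becomes unstable, which is the multicollinearity problem the abstract addresses by substituting principal component estimation.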
  • Abstract: A semi-structured document has more structured information than an ordinary document, and the relations among semi-structured documents can be fully utilized. In order to take advantage of the structure and link information in a semi-structured document for better mining, a structured link vector model (SLVM) is presented in this paper, where a vector represents a document, and the vector's elements are determined by terms, document structure and neighboring documents. Text mining based on SLVM is described through the K-means procedure for brevity and clarity: calculating document similarity and calculating the cluster center. In the experiments, clustering based on SLVM performs significantly better than that based on a conventional vector space model, and its F value increases from 0.65-0.73 to 0.82-0.86.

  • Tags: HTML, XML, semi-structured document model, text mining, structural information