Topics of statistical analysis with social media data - ePrints ...

This thesis investigates the use of social media data in social research from a statistical perspective. A broad review is given of how these data have been used by researchers from different disciplines, and the extent and means of the investigations carried out with these data are assessed. Special attention has been given to the common obstacles faced when using social media data for statistical analysis, and to the graph representation of these data that is generally available to the researcher and its use for statistical inference.

Most of the literature about the use of social media data for statistical analysis is concerned with the fact that these data represent a non-random sample from the population of interest. We have instead highlighted another fundamental challenge presented by these data, which is, however, rarely taken explicitly into consideration. The problem is that the object of sampling and the unit of interest might be distinct. To tackle this problem, we have shown how two different approaches to statistical inference can be distinguished in the literature. Under each approach, we have provided a discussion of the target of inference and made explicit their limitations in relation to the statistical methods used. Our exposition offers a framework for dealing with unruly data sources. However, the problems of a non-random sample and various unavoidable non-sampling errors do not admit a universally valid statistical approach. One can cope with them if need be, but one cannot really hope to solve these problems.

Meanwhile, the graph structure inherent in social media data (and other forms of big data) seems to us a more rewarding area of research. We have investigated how to use the structure of the graph for estimation. The Horvitz-Thompson (HT) estimator operates by weighting each sample motif by the inverse of its inclusion probability. Generalising the work of Birnbaum and Sirken (1965), we demonstrated that infinitely many types of incidence weights can be constructed for unbiased estimation. We define the Incidence Weighting Estimator (IWE) as a large class of linear design-based unbiased estimators based on the edges of the Bipartite Incidence Graph (BIG), of which the HT estimator is a special case. This class of estimators has no equivalent in traditional list sampling.

More ways of using the incidence structure of the BIG for estimation have been explored, and in doing so we enter completely new territory. We have investigated how to use the incidence structure of the BIG to estimate a total based on the sampling units and, once such an estimator has been obtained, we have discussed whether and how it can be used together with the IWE to improve the inference. We have also seen that it is possible to use the reverse incidence weights in combination with the incidence weights. The weights obtained in such ways can be used to construct an unbiased estimator in both directions, although the idea seems somewhat impractical at the moment. The final chapter aims to offer a flavour of what can be done under the BIG framework and to inspire future research in this direction.

The thesis is organised in four papers: the first paper discusses the current statistical analysis made using social media data, while the other three papers deal with the topic of graph sampling and estimation.

University of Southampton
Patone, Martina
Zhang, Li-Chun
January 2020

Patone, Martina (2020) Topics of statistical analysis with social media data. University of Southampton, Doctoral Thesis, 171pp.
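The abstract describes the Horvitz-Thompson weighting and the incidence-weighting idea only in words, so a small numerical sketch may help. The Python below is an illustration and not the thesis's own code. The toy graph, the motif values, the simple random sampling design and every function name are assumptions made for this example. It also assumes the usual unbiasedness condition that the weights on the edges leading into a motif sum to one, and uses the equal-share (multiplicity) weights in the spirit of Birnbaum and Sirken (1965) as one possible choice. Both estimators are checked by enumerating every possible sample and averaging.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

# Hypothetical toy bipartite incidence graph (not taken from the thesis):
# each sampling unit maps to the set of motifs it is incident to.
incidence = {
    "a": {"k1", "k2"},
    "b": {"k2"},
    "c": {"k2", "k3"},
    "d": {"k3"},
}
y = {"k1": 5, "k2": 2, "k3": 7}   # illustrative value carried by each motif
true_total = sum(y.values())      # the total we want to estimate

units = sorted(incidence)
N, n = len(units), 2              # assumed design: SRS without replacement of n units
pi_unit = Fraction(n, N)          # inclusion probability of each sampling unit

# Multiplicity of a motif: how many sampling units are incident to it.
multiplicity = {k: sum(k in motifs for motifs in incidence.values()) for k in y}


def motif_inclusion_prob(k):
    """P(motif k is observed) = 1 - P(none of its incident units is sampled)."""
    m = multiplicity[k]
    return 1 - Fraction(comb(N - m, n), comb(N, n))


def ht_estimate(sample):
    """Horvitz-Thompson: each observed motif weighted by 1 / inclusion probability."""
    observed = set().union(*(incidence[i] for i in sample))
    return sum(y[k] / motif_inclusion_prob(k) for k in observed)


def multiplicity_estimate(sample):
    """Equal-share incidence weights: every sampled edge (i, k) contributes
    y_k / m_k, scaled by 1 / pi_i for the sampling unit i."""
    return sum(Fraction(y[k], multiplicity[k]) / pi_unit
               for i in sample for k in incidence[i])


def design_expectation(estimator):
    """Average of the estimator over all equally likely samples of size n."""
    samples = list(combinations(units, n))
    return sum(estimator(s) for s in samples) / len(samples)


print(design_expectation(ht_estimate),
      design_expectation(multiplicity_estimate),
      true_total)
```

Exact fractions are used so that the unbiasedness check is an equality and not a floating-point approximation. Any other set of edge weights that sums to one for each motif could be swapped into `multiplicity_estimate` under the same assumption.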
Record type: Thesis (Doctoral)

Text: ThesisFinalMP - Version of Record
Available under License University of Southampton Thesis Licence. (1MB)

Text: Permission to deposit thesis_signed (1)_RW
Restricted to Repository staff only

More information
Published date: January 2020

Identifiers
Local EPrints ID: 443415
URI: http://eprints.soton.ac.uk/id/eprint/443415
PURE UUID: 7d0c2764-93e1-45aa-af73-8bf852569054
ORCID for Li-Chun Zhang: orcid.org/0000-0002-3944-9484

Catalogue record
Date deposited: 24 Aug 2020 16:35
Last modified: 13 Dec 2021 03:09

Contributors
Author: Martina Patone
Thesis advisor: Li-Chun Zhang


