USA elections 2016
      Project data and workflow
    
    
      
        Contents
        
- Data: The initial data set. It consists of 4 files containing the tweets and users for each contender.
- Classifier: The naive Bayes classifier (a JAR file based on Apache Spark ML), including the training data
- Scripts: The scripts that, executed sequentially, constitute the workflow
- Workflow: PDF file discussing each step of the workflow and showing the code
- hr.py: Python code that obtains the thresholds for the opinion labels and their averaged precision
Requirements
        The workflow requires MongoDB 3.4 or higher running locally. The classifier requires a Java runtime (JRE) 1.8 or higher.
         Finally, the Python program that obtains the thresholds (hr.py) requires Python 3.
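
        The requirements above can be checked before running the workflow. The following is a minimal sketch and not part of the project: the function names are hypothetical, the MongoDB check assumes the default port 27017 (the project may be configured differently), and neither the MongoDB version nor the Java version is verified.

```python
import shutil
import socket
import sys


def python_ok(version_info=sys.version_info):
    """Return True when the interpreter is Python 3 or newer."""
    return version_info[0] >= 3


def java_on_path():
    """Return True when a `java` executable is found on PATH.

    The version (1.8 or higher) is NOT checked here.
    """
    return shutil.which("java") is not None


def mongodb_reachable(host="localhost", port=27017, timeout=1.0):
    """Return True when something accepts TCP connections on host:port.

    Assumption: 27017 is MongoDB's default port. This only tests that the
    port is open; it does not confirm the server is MongoDB 3.4 or higher.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("Python 3:", python_ok())
    print("java on PATH:", java_on_path())
    print("MongoDB port open:", mongodb_reachable())
```

        A False result for any of the three checks suggests that the corresponding requirement should be reviewed before executing the scripts.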
       
        Please send any comments to rafacr@ucm.es