Working with an Agile team and environmental scientists, I have developed end-to-end cloud data pipelines for anomaly detection and quality control of sensor-collected water quality parameters (pH, electrical conductivity, and temperature) and for detecting out-of-water sensors, including automated notifications. I have also implemented protocols and methodologies using open-source Python libraries for the analysis and quality control of hydrological data, with a focus on aquatic sensors.
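To illustrate the kind of checks such a quality-control step might run, here is a minimal Python sketch using only the standard library. It is not the production pipeline: the parameter names, plausibility limits, the conductivity threshold used to guess that a probe is out of the water, and the rolling median/MAD spike test are all assumptions chosen for illustration.

```python
from statistics import median

# Hypothetical plausibility limits; real limits depend on the site and sensor.
PLAUSIBLE_RANGE = {
    "ph": (0.0, 14.0),
    "ec_us_cm": (0.0, 100_000.0),
    "temp_c": (-5.0, 50.0),
}
# Assumed rule of thumb: conductivity near zero suggests the probe is in air.
OUT_OF_WATER_EC = 5.0


def range_flags(reading):
    """Return the names of parameters outside their plausible range."""
    return [
        name
        for name, (low, high) in PLAUSIBLE_RANGE.items()
        if name in reading and not (low <= reading[name] <= high)
    ]


def is_out_of_water(reading):
    """Guess whether the sensor is out of the water from conductivity alone."""
    return reading["ec_us_cm"] < OUT_OF_WATER_EC


def spike_flags(values, window=3, threshold=3.5):
    """Flag points that sit far from the median of their neighbours.

    Distance is scaled by the neighbours' median absolute deviation (MAD),
    so a single spike does not inflate its own threshold.
    """
    flags = []
    for i, value in enumerate(values):
        neighbours = values[max(0, i - window):i] + values[i + 1:i + 1 + window]
        if not neighbours:
            flags.append(False)
            continue
        centre = median(neighbours)
        mad = median(abs(x - centre) for x in neighbours)
        scale = max(1.4826 * mad, 1e-9)  # avoid dividing by zero on flat data
        flags.append(abs(value - centre) / scale > threshold)
    return flags


if __name__ == "__main__":
    ph_series = [7.1, 7.2, 7.1, 7.3, 12.9, 7.2, 7.1, 7.2]
    print(spike_flags(ph_series))
    print(range_flags({"ph": 15.2, "ec_us_cm": 420.0, "temp_c": 21.5}))
    print(is_out_of_water({"ph": 7.0, "ec_us_cm": 0.8, "temp_c": 29.0}))
```

In a real deployment the flags would feed the notification step, and the thresholds would be tuned per site with the environmental scientists rather than hard-coded.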
      
    
   
  
  
    
    
      
      In this project, I used data from the Cape York catchment geochemistry assessment to predict sediment types using four different machine learning algorithms. Code available on GitHub: http://bitly.ws/gUHK
      
    
   
  
  
    
    
      
      I analysed a chemical dataset (over 3,000 observations and 35 variables) from a lagoon sediment core using statistical and machine learning techniques and Bayesian modelling, leading to a novel record of climate and environmental change in north Queensland. I performed the data cleaning, wrangling, manipulation, and visualisation in RStudio. Machine learning techniques included clustering (hierarchical clustering), ordination (principal component analysis, principal curves), and linear regression to identify and delineate periods of significant environmental change.
      
    
   
  
  
    
    
      
      In this subproject derived from my doctoral thesis, I created a Bayesian model to assign ages to the sediments of a lagoon in northern Queensland. This model made it possible to identify periods of environmental change in the sequence. Code available at: https://github.com/mariariveraaraya/Environmental_climate_change_Australia
      
    
   
  
  
    
    
      
      In this subproject derived from my doctoral thesis, I reconstructed the vegetation, hydrology, and fire history of the area using seven different geochemical and biological techniques. Using several data mining techniques, such as hierarchical clustering analysis, principal curves, and regression, I identified periods of pronounced environmental change. Additionally, regional climatic events, such as the reactivation of the monsoon and sea level rise, are reflected in the local ecosystems through a diversity of biogeochemical responses.
      
    
   
  
  
    
    
      
      We used data from various Earth-observing satellites and in situ stations to analyze and monitor the current state of meteorological and agricultural drought across the Arenal-Tempisque watershed using two indices. The Standardized Precipitation Index (SPI) was used to monitor meteorological drought, and the Scaled Drought Condition Index (SDCI) was used to monitor agricultural drought. The team also created information for water balance assessment (modelling stream flow and evapotranspiration rates) using the Soil and Water Assessment Tool (SWAT) model by combining NASA Earth observations, ancillary data sources, and in situ data.
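For readers unfamiliar with the SPI, the idea is to convert accumulated precipitation into a standard normal score, so that negative values indicate drier-than-usual conditions. The index is conventionally computed by fitting a probability distribution (commonly a gamma distribution) to the precipitation totals. The Python sketch below swaps in a simpler nonparametric variant based on the Gringorten plotting position, purely to show the standardisation step; it is not the method used in the project, and the function name and sample values are hypothetical.

```python
from statistics import NormalDist


def empirical_spi(precip_totals):
    """Standardise precipitation totals using empirical probabilities.

    Each total is ranked, the rank is turned into a non-exceedance
    probability with the Gringorten plotting position, and that probability
    is mapped to a standard normal quantile. Tied totals receive adjacent
    ranks in their input order (a simplification).
    """
    n = len(precip_totals)
    order = sorted(range(n), key=lambda i: precip_totals[i])
    std_normal = NormalDist()
    spi = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        prob = (rank - 0.44) / (n + 0.12)  # always strictly between 0 and 1
        spi[idx] = std_normal.inv_cdf(prob)
    return spi


if __name__ == "__main__":
    # Hypothetical accumulated totals (mm) for the same month in five years.
    totals = [120.0, 35.0, 80.0, 210.0, 60.0]
    for total, score in zip(totals, empirical_spi(totals)):
        print(f"{total:6.1f} mm -> {score:+.2f}")
```

With so few values the scores are only illustrative; a usable index needs a long precipitation record for each location and accumulation period.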
      
    
   
  
  
    
    
      
      This study demonstrated the ability to identify early-stage mangrove degradation using data collected from Terra and the Landsat series. Biophysical characteristics of mangroves were determined through the evaluation of chlorophyll content (CHL), leaf area index (LAI), and gross primary productivity (GPP). Sentinel-2 and ASTER data were used to enhance the spatial resolution. More information about the project: https://github.com/mariariveraaraya/NASA_monitoring_mangroves_India. Project video: https://www.youtube.com/watch?v=ohhWhS_BM_I