Data Science | Strategic Science Investment Fund

The projects in this programme use modern data science methods for earth science research and help to grow data science capabilities across GNS Science.

Overview:

Understanding data by discovering patterns and signals is at the heart of what we do. Extracting fundamental insights from datasets that continue to grow rapidly is only possible with modern data science technologies. Realising this capability can greatly advance our science and benefit the people of New Zealand.

This programme is intentionally designed to support other projects and initiatives in integrating data analysis methods from AI and machine learning, along with modern data visualisation, into everything we do.

  • What is Artificial Intelligence and Machine Learning?

    Artificial Intelligence can be defined as the creation of machines that can undertake tasks more usually associated with human levels of intelligence. These tasks may include speech recognition, decision making or visual perception. Machine Learning is a subset of Artificial Intelligence and refers to algorithms and approaches that allow machines to learn without being explicitly programmed.
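
As a minimal sketch of what "learning without being explicitly programmed" can mean, the Python example below estimates a simple linear rule from example data instead of hard-coding it. The data, the function names and the choice of ordinary least squares are assumptions made for illustration only; they are not taken from any GNS Science project.

```python
# Illustrative only: learn a linear rule y = slope * x + intercept from
# examples by ordinary least squares, instead of hard-coding the rule.

def fit_line(xs, ys):
    """Return the slope and intercept that minimise the squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up training examples that happen to follow y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

# The rule is recovered from the data; it is never written into the code.
slope, intercept = fit_line(xs, ys)

def predict(x):
    """Apply the learned rule to a new input."""
    return slope * x + intercept
```

The program is never told that the rule is "multiply by two and add one"; it recovers those numbers from the examples and can then apply them to inputs it has not seen.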

Key Projects:

Integration of Data Science into GNS

Grow integration of data science within research programmes and projects by encouraging uptake through base integration, seminars, interactive workshops and internships.

  • The base integration project is an open programme component established to bring data science into other projects around GNS. Funding has been set aside so that data scientists can support projects that want to draw on data science expertise.
  • Our internship programme is led by Rob Buxton, Yannik Behr and Tatiana Goded. GNS Science supports a small number of Data Science students every year. The students complete a small project over the summer break (typically November to February). Previous projects have included the design and development of a prototype Volcano Rover (an autonomous platform for collecting data on volcanos), and the redevelopment of code to process earthquake felt reports more efficiently.
  • Seminars and workshops are hosted by the AI & AA Team, including a weekly seminar timeslot in which members of the team, the wider GNS community and guest speakers give 30 to 40 minute presentations on topics related to Data Science.

Watch our little Volcano Rover go!

The rover was developed in one of our internship projects.

Rover schematic: our previous internship projects have developed a prototype Volcano Rover, an autonomous platform for collecting data on volcanos.

Science to Operations:

Operationalise data science applications on internal and external cloud computing platforms for more effective distribution of data science tools.

  • Earthquake catalog generation pipeline: develop a workflow template for reproducible and operational science applications, with an initial focus on earthquake catalog generation, in partnership with Victoria University of Wellington.

Theory of Data Science:

Explore new machine learning and artificial intelligence methods in collaboration with peers, stakeholders and end-users, with a focus on theoretical data science and machine intelligence.

  • Manuscript on review of graph models for earth science. Work is underway to review recent (post-2015) publications on graph data structures related to earth sciences.
  • Natural Language Processing for social science (Rob Buxton, Ceilia Wells, Christof Mueller). The AI Team collaborated with the GNS Social Science Team to investigate whether Natural Language Processing techniques could automate “content analysis”, a time-consuming and previously manual process in which media is analyzed to ascertain its underlying meaning. Natural Language Processing suits this purpose because it is the discipline of creating machines that can understand written or spoken language.
  • Probabilistic Circuits (Florent Aden, Yannik Behr): Forecasts from AI/machine learning models often come without uncertainties. This makes it difficult for us to use these forecasts with confidence. Probabilistic Circuits (PCs) are machine learning models that combine the expressiveness of deep neural networks with a probabilistic view of the data that we want to model. As a result, PCs provide probabilistic forecasts that end-users can trust.
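
To make the idea behind Probabilistic Circuits more concrete, the sketch below builds a toy circuit in Python from one sum node, two product nodes and four Bernoulli leaves. The two variables ("rain" and "wet"), the structure and every parameter are invented for illustration and have no connection to the models used in the project. It shows how the same circuit can answer joint, marginal and conditional probability queries.

```python
# Toy probabilistic circuit over two binary variables, "rain" and "wet".
# Structure and parameters are invented purely for illustration.

def bernoulli_leaf(p_true):
    """Leaf node: probability of the observed value.

    A value of None means the variable is marginalised out, so the leaf
    returns 1.0 (the sum over both outcomes).
    """
    def leaf(value):
        if value is None:
            return 1.0
        return p_true if value else 1.0 - p_true
    return leaf

def product_node(children):
    """Product node: children cover disjoint variables; multiply outputs."""
    def node(assignment):
        result = 1.0
        for variable, leaf in children.items():
            result *= leaf(assignment.get(variable))
        return result
    return node

def sum_node(weighted_children):
    """Sum node: weighted mixture of children; weights should sum to 1."""
    def node(assignment):
        return sum(w * child(assignment) for w, child in weighted_children)
    return node

circuit = sum_node([
    (0.3, product_node({"rain": bernoulli_leaf(0.9), "wet": bernoulli_leaf(0.8)})),
    (0.7, product_node({"rain": bernoulli_leaf(0.1), "wet": bernoulli_leaf(0.2)})),
])

p_joint = circuit({"rain": True, "wet": True})   # P(rain, wet)
p_rain = circuit({"rain": True})                 # P(rain), "wet" marginalised
p_wet_given_rain = p_joint / p_rain              # P(wet | rain)
p_total = circuit({})                            # all variables marginalised
```

Each query is a single bottom-up pass through the circuit, which is why models of this kind can return probabilities, rather than only point forecasts, at low cost.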

Data Science Methodology transfer:

Application of established machine learning and artificial intelligence methods, including deep learning, surrogate/meta modelling, natural language processing and graph analysis.

  • Ground-water meta-modelling with Neural Networks (Sapthala Karalliyadda, Conny Tschritter, Brioch Hemmings):  
    This research aims to support the metamodeling workstream of the Te Whakaheke o Te Wai (TWOTW) research programme by developing a neural-network model that mimics groundwater age calculations made with numerical models. This work will complement the metamodeling capabilities developed under the TWOTW programme.
  • Use geodetic and seismic data to detect and analyze slow slip events (Florent Aden):  
    We explore how data science can help identify and recognize spatio-temporal patterns in seismic and InSAR data that relate to ground deformation, such as creeping faults or slow slip events.
  • Supervised or unsupervised segmentation of terrain data and terrain data generation (Christof Mueller): 
    GNS is working with a wealth of geospatial data (InSAR, LiDAR, satellite imagery, aerial photography, etc.). We extract knowledge about landforms and landform features from these data sets to better understand, for example, natural hazards, landform development and land use. With an ever-growing amount of data, manual interpretation of such data sets becomes an almost insurmountable task. As in other disciplines, such as medical imaging or image analysis in self-driving cars, we work towards using deep learning methods to augment the interpretation of such data sets.
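
The meta-modelling (surrogate modelling) idea mentioned in the groundwater project above can be sketched in a few lines of Python: run a slow model at a limited number of inputs, then answer later queries with a cheap approximation built from those runs. This sketch uses piecewise-linear interpolation in place of a neural network, and `slow_model` is a hypothetical stand-in formula, not a groundwater model.

```python
import bisect
import math

def slow_model(x):
    """Hypothetical stand-in for an expensive numerical model."""
    return math.exp(-x) * math.sin(3.0 * x)

def build_surrogate(model, lo, hi, n_samples):
    """Run the model at n_samples evenly spaced inputs in [lo, hi] and
    return a cheap function that interpolates between the stored runs."""
    step = (hi - lo) / (n_samples - 1)
    xs = [lo + i * step for i in range(n_samples)]
    ys = [model(x) for x in xs]

    def surrogate(x):
        # Find the two stored runs that bracket x, then interpolate linearly.
        i = bisect.bisect_right(xs, x)
        i = min(max(i, 1), len(xs) - 1)
        x0, x1 = xs[i - 1], xs[i]
        y0, y1 = ys[i - 1], ys[i]
        return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

    return surrogate

# 41 runs of the slow model pay for any number of cheap queries afterwards.
surrogate = build_surrogate(slow_model, 0.0, 2.0, 41)

# Compare the surrogate with the slow model at inputs it was not built from.
test_points = [0.013 + 0.1 * k for k in range(20)]
max_error = max(abs(surrogate(x) - slow_model(x)) for x in test_points)
```

A neural-network surrogate follows the same pattern: the stored runs become training data, and the trained network replaces the interpolation step.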

Christof Mueller Computational Geophysicist

Christof is a computational geophysicist with original training in physics. He has been with GNS Science since 2008. His career in our organisation started with developing advanced imaging and processing algorithms for seismic data and then led him into tsunami research; he now leads our Artificial Intelligence (AI) and Advanced Analytics team. His research interests still lie in tsunami modeling, evacuation zoning and the effects of non-uniform slip on local tsunami impact. He works in areas of data science, machine learning and deep learning (data generation, segmentation and surrogate modeling), with applications to tsunami and to submarine and subaerial landslides. He is also interested in modern software pipelines and software engineering problems.
