Data Analytics

Sigma2's data analytics service makes it possible to process large data sets from anywhere using distributed processing frameworks. This covers both data sets stored in a researcher's project area and public data sets, e.g. 1000 Genomes and Common Crawl. Researchers can also define the tools required to process their data sets using Linux containers. The service is currently in a prototyping phase and is planned as an upcoming service in 2017.
Who can use the service:
The data analytics service is available to staff and students at Norwegian universities and university colleges in need of storage and analysis of scientific data. Researchers in independent research institutes may also gain access to this service, provided that their work is funded by public grants and meets requirements for scientific publication.
Prerequisites:

Access to the data analytics service requires an active user account on the national e-infrastructure (for HPC or data services).

Status description: 
The service is a pilot. It may have any of the three defined reliability expectations and must have low criticality. Its lifetime is defined within a maximum of 12 months. At the end of its lifetime, a pilot may be shut down, have its lifetime extended, or be upgraded to a production service. Access to a pilot may be restricted to a specific group (for example, staff) or institution. A user guide must be available (documentation and, where applicable, user support).
Duration:
Monday, 1 May 2017, 10:00 to Sunday, 31 December 2017, 10:00
Service description:

Data analytics as a service enables researchers to analyse data stored in the Sigma2 project storage service. The service provides a user-friendly web interface based on Jupyter Notebooks for analysing data with distributed frameworks such as Apache Spark. The service runs on Linux containers (e.g. Docker) and can be adapted to a research group's specific needs for frameworks and packages. The service supports Dataporten for authentication and authorization of users, which enables collaboration between users from different institutions. Currently supported applications:

    - Jupyter Notebook

    - Apache Spark for data processing

    - Applications defined by research groups using containers
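
To give a feel for the kind of analysis a notebook on the service might run, the following is a minimal sketch of the map and reduce pattern that frameworks such as Apache Spark distribute across a cluster. It is an illustration only: it uses nothing but the Python standard library and a local thread pool, it does not use the Spark API or any interface of the service, and the function names (`count_words`, `merge_counts`, `word_count`) and the sample data are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def count_words(partition):
    """Map step: count the words in one partition (a list of text lines)."""
    counts = Counter()
    for line in partition:
        counts.update(line.lower().split())
    return counts


def merge_counts(left, right):
    """Reduce step: combine two partial counts into one."""
    return left + right


def word_count(partitions, workers=4):
    """Count words across all partitions, handling partitions concurrently."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = list(pool.map(count_words, partitions))
    return reduce(merge_counts, partial_counts, Counter())


# Hypothetical sample data standing in for files in a project area.
sample_partitions = [
    ["spark processes data", "data in partitions"],
    ["jupyter notebooks drive spark"],
]
totals = word_count(sample_partitions)
print(totals.most_common(2))
```

In a distributed framework the partitions would typically live on different nodes and the map step would run where the data is stored; the structure of the computation stays the same.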

Cost:

The costs of operating and investing in the service are covered by grants from the Research Council of Norway and contributions from four universities: the Universities of Tromsø, Bergen and Oslo (UiT The Arctic University of Norway, UiB and UiO, respectively) and the Norwegian University of Science and Technology (NTNU). These contributions grant the staff at these universities access to additional resources. From 2017, projects funded by the Research Council of Norway or the EU will be charged for using this service.

Ordering information:

We offer this service to interested pilot users and projects. Please contact sigma2@uninett.no for more information or to sign up as a pilot user.

Organisation of the service:

UNINETT Sigma2 AS manages the national infrastructure for computational science in Norway and offers services in high performance computing and data storage.

Contact information: