1.5 Frequently Asked Questions

1.5.0.3 Vocabulary

1.5.0.4 Tools

1.5.0.7 Accessing data

  • How do I download data from OBIS?
  • How do I load the full (.csv) export of OBIS data?

    Loading the entire OBIS dataset requires a lot of memory and is probably not feasible on most desktop computers. Depending on the use case, you have a few options: i) process the data in smaller batches, or ii) load the dataset into a local database such as SQLite and use SQL queries to analyze the data.

    Alternatively, we recommend using the parquet download, which is available here, instead of the CSV. In R, you can use the arrow package to work with parquet files without loading the whole dataset into memory. We also have a short tutorial on working with parquet files in R here, with an example application of this approach here (see first code block).
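
    As a minimal sketch of the arrow approach, the example below opens a parquet file lazily and only loads the rows that match a filter. To keep it self-contained, it first writes a tiny stand-in file to a temporary directory; the file name, the example rows, and the choice of columns are assumptions for illustration and do not represent the actual OBIS export.

    ```r
    library(arrow)
    library(dplyr)

    # Hypothetical stand-in for the OBIS parquet export (illustrative rows only)
    path <- file.path(tempdir(), "obis_sample.parquet")
    write_parquet(
      data.frame(
        scientificName = c("Abra alba", "Abra alba", "Mola mola"),
        decimalLongitude = c(2.5, 3.1, -70.2),
        decimalLatitude = c(51.3, 51.6, 41.0)
      ),
      path
    )

    # open_dataset() reads only the file metadata; no rows are loaded yet
    ds <- open_dataset(path)

    # The filter and column selection are applied while scanning the file;
    # collect() then loads only the matching rows into memory
    abra <- ds %>%
      filter(scientificName == "Abra alba") %>%
      select(scientificName, decimalLongitude, decimalLatitude) %>%
      collect()
    ```

    With the real export, you would point open_dataset() at the downloaded parquet file or folder and filter on the columns you need before calling collect().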
  • How can I use R to access OBIS data?
  • How do I use the OBIS API to fetch and filter data?
  • How do I contact the data provider?
  • How can I cite OBIS datasets and downloads?
  • What are the definitions of the field names in the downloads generated by OBIS?
  • How do I obtain a taxon checklist for an area?

    There are a few possible ways to obtain a taxon checklist for a given area. As an example, we will obtain a checklist of species in the Albanian EEZ. To do this, we fetch all occurrences within a bounding box around the area of interest, and then filter them against the actual EEZ geometry.

      library(mregions)
      library(dplyr)
      library(robis)
      library(sf)
      #obtain Albanian EEZ as sf
      geom <- mr_shp(key = "MarineRegions:eez", filter = "Albanian Exclusive Economic Zone", maxFeatures = NULL)
      #get WKT for the bounding box
      wkt <- st_as_text(st_as_sfc(st_bbox(geom)), digits = 6)
      #fetch occurrences for bounding box
      occ <- occurrence(geometry = wkt) %>%
        st_as_sf(coords = c("decimalLongitude", "decimalLatitude"), crs = 4326)
      #keep only the occurrences that intersect the EEZ geometry
      occ_filtered <- occ %>%
        filter(lengths(st_intersects(geometry, geom)) > 0) %>%
        as_tibble() %>%
        select(-geometry)
      #count the records per taxon
      alb_taxa <- occ_filtered %>%
        group_by(phylum, class, order, family, genus, species, scientificName) %>%
        summarize(records = n(), .groups = "drop")
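
    Another possible route is to let the OBIS API build the checklist, using the checklist() function of the robis package with a WKT geometry. The sketch below uses a small hand-written polygon as an assumption for illustration; it is not the Albanian EEZ boundary, so the result covers everything inside that polygon. It requires a network connection.

    ```r
    library(robis)

    # Illustrative polygon in WKT (longitude latitude); an assumption for this
    # sketch, not the actual EEZ geometry
    wkt <- "POLYGON ((19.0 40.0, 19.5 40.0, 19.5 40.5, 19.0 40.5, 19.0 40.0))"

    # Ask OBIS for the taxa recorded within the polygon
    cl <- checklist(geometry = wkt)

    # Inspect the first few taxa
    head(cl)
    ```

    Because the checklist is computed on the server, this avoids downloading individual occurrences, but it cannot be clipped to an exact boundary afterwards the way the occurrence-based approach can.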
  • How do I convert or obtain separate elements from dates in the data download file (e.g. date_start field)?

    The values in date_start, date_mid, and date_end are Unix timestamps in milliseconds, calculated from the ISO date in the eventDate column. In a spreadsheet, you can convert these numerical values to dates using the formula below, where E2 is the cell holding the timestamp.

    =(E2/86400000)+DATE(1970,1,1)

    If you still see numbers after applying this formula, set the cell formatting to Date. Once you have dates, you can extract individual elements, e.g. the month for seasonal analyses, where H2 is the cell holding the converted date:

    =MONTH(H2)
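
    The same conversion can be sketched in base R. The timestamp values below are made up for illustration; the division by 1000 assumes, as in the spreadsheet formula, that the values are in milliseconds.

    ```r
    # Illustrative date_start values: milliseconds since 1970-01-01 UTC
    date_start <- c(1609459200000, 1593561600000)

    # as.POSIXct() expects seconds, so convert from milliseconds first
    dates <- as.POSIXct(date_start / 1000, origin = "1970-01-01", tz = "UTC")

    # Extract the month as an integer, e.g. for seasonal analyses
    months <- as.integer(format(dates, "%m"))
    ```

    Other elements can be extracted the same way by changing the format string, for example "%Y" for the year.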
  • How do I filter by or obtain trait information for OBIS data (e.g. all benthic organisms)?

    Currently, it is not possible to filter OBIS data by trait. To obtain trait information, we recommend using the traits database of the World Register of Marine Species. For example, when searching by “functional group”, you can specify benthos, plankton, nekton, etc.