Introduction¶
The Environmental Sensor Map is a data exploration tool with a customized public web interface that allows scientists, managers, and the general public to discover and access public data.
The Environmental Sensor Map has 2 major components:
Data Map
Data Views
See the following sections of this help documentation for more information about each of these components:
Catalog Overview¶
Download Data¶
In addition to visualizing a dataset you can download datasets by clicking the download button and selecting among the options in the popup window. Data files may be accessed using interoperability services (i.e. ERDDAP, THREDDS), downloaded individually in different file formats, or bundled for download using the Download Queue. See below for more information about data format.
Data Formats¶
There are several ways to download gridded data from the Environmental Sensor Map:
THREDDS
NetCDF Subset
ERDDAP
OPeNDAP
WMS
THREDDS¶
Thematic Realtime Environmental Distributed Data Services (THREDDS) is a set of services provided by Unidata that allows for machine and human access to raster data stored in NetCDF formats. THREDDS provides spatial, vertical, and temporal subsetting, as well as the ability to select individual dimension or data variables to reduce file transfer sizes. The most commonly used THREDDS services for users are NetCDF Subset, and Open-source Project for a Network Data Access Protocol (OpenDAP).
Note
All THREDDS servers have a bandwidth limit, and it will not allow you to download more than the cap in one go. So you won’t be able to download 1 Tb of data with a single request. If you need a lot of data, you will need to break up your requests to download the dataset incrementally (e.g., one month at a time; one variable at a time, etc.). If you’re grabbing a lot of data programmatically, sometimes it’s easiest to grab just one time slice at a time using a loop.
NetCDF Subset¶
The NetCDF Subset protocol looks through all the datasets NetCDF files stored on our server, and provides an human-readable or machine-readable interface to subset the data by time, geography, or variable. For more information, please see the Download Using NetCDF page.
Tip
When you initially request a dataset via NetCDF Subset, the server may take a long time to respond if the dataset is large (i.e., thousands of files). Be patient, it’s not broken! If your web browser times out (e.g., after 10 minutes of waiting), you can try reloading or just giving it a few more minutes and then reload. This won’t restart the server process, and once it’s indexed all the files things will go pretty fast.
ERDDAP¶
The ERDDAP (National Ocean and Atmospheric Administration’s Environmental Research Division’s Data Access Program) Server is a free and open-source Java “servlet” that converts and serves a variety of scientific datasets using common file formats. ERDDAP gives you a simple, consistent way to download subsets of datasets in common file formats, in addition to making graphs and maps. All information about every ERDAPP request is contained in the URL of each request, which makes it easy to automate searching for and using data in other applications. Proficient users can build their own custom interfaces. Many organizations (including NOAA, NASA, and USGS) run ERDDAP servers to serve their data. Because of its widespread use and accessibility, the ERDDAP principal developer and user community have created user guides, instruction videos, and code examples to facilitate access by new users.
OPeNDAP¶
OPeNDAP is a simpler THREDDS protocol that can provide ASCII (human-readable) or binary files. It loads very quickly, but doesn’t do any interpretation for you at all and you will need to be able to calculate or surmise the indices you need to subset the data. For example, if there are 20,000 dates listed in the file, it will give you the option of selecting 0-20,000, but it won’t tell you what those dates are. Therefore, OPeNDAP is best in cases where you are already familiar with the dataset’s bounds and speed is more important, or in cases where you just want to download the whole thing and don’t care much about subsetting.
Note
All THREDDS servers have a bandwidth limit, and it will not allow you to download more than the cap in one go. So you won’t be able to download 1 Tb of data with a single request. If you need a lot of data, you will need to break up your requests to download the dataset incrementally (e.g., try downloading half a variable first, then the second half, or one variable at a time, etc.).
For more details, please see the Download Using OpeNDAP page.
WMS¶
Web mapping services (WMS) are used to provide machine access to images used by remote mapping programs (e.g., tiling services). Accessing programs use GetCapabilities requests to ask for image data in whatever format they require, which allows them to gather image tiles over specific areas with the projections, styles, scales and formats (PNG, JPG, etc.) that fits their needs.
Selecting WMS (Web Mapping Service) under the Download
button will start the WMS service. The returned image will be projected according to the parameters set in the URL. In the example below, modifying either the parameters (e.g., changing the WIDTH
, COLORSCALERANGE
values) or the projection will redraw the image for your mapping service.
For more details, please see the Download Using WMS page.
Virtual Sensors¶
For details on how to download data from virtual sensors, please see the Download Virtual Sensor Data page.
Parsed Data¶
This section of our documentation is still under development. For assistance, please contact us via the Feedback button .
NetCDF Resources¶
NetCDF is the name of a file format as well as a grouping of software libraries that describe that format. The files have the ability to contain multidimensional data in a wide variety of data types, and they are highly optimized for file I/O. This makes them excellent at storing extremely large datasets because they can be quickly and easily sliced without putting the entire dataset into RAM.
In addition, NetCDF files can contain metadata attributes that describe any time components, dimensions, units, history, etc. Because of this, NetCDF is often called a self-describing data format and they are excellent for holding archived data, and they are the primary format preferred by the National Centers for Environmental Information (NCEI, formerly NODC).
NetCDF libraries are available for every common scientific programming language including Python, R, Matlab, ODV, Java, and more. Unidata maintains a list of free software for manipulating or displaying NetCDF data. A good, simple program to start exploring NetCDF data is Unidata’s ncdump, which runs on the command line and can quickly output netCDF data to your screen as ASCII. Unidata’s Integrated Data Viewer or NASA’s Panoply are free, relatively easy programs to use that will display gridded data, though they are not as straightforward to use as a scientific programming language.
Map Overview¶
Real-Time Data¶
Real-time data are ingested, served, and displayed in the Environmental Sensor Map at the same frequency the data are collected (and sometimes reported) by the originator with little to no delay. Examples of real-time assets include weather stations, oceanographic buoys, and webcams. For the purposes of this documentation, it’s helpful to understand how the following real-time data terms are defined:
Term |
Definition |
---|---|
Hexagonal bin |
A group of stations that are aggregated into a hexagon for visual summary. |
Station |
A device that collects data related to the weather and environment using many different sensors (e.g. weather station). |
Sensor |
An individual measurement device affixed or associated with a station (e.g. thermometer, barometer). |
Parameter |
The type of value measured by the sensor (e.g. temperature, pressure). |
Real-time data from observation stations are aggregated into hexagonal bins to visually summarize data over a large spatial area when the map is zoomed out. This means that data from more than one station may be displayed within a hexagon. The color of the hexagon represents the average value of the selected sensor parameter within that hexagon. For example, if air temperature is the selected sensor type, then the hexagon color will reflect the average temperature for all stations within that bin.
To view a summary of the station data contained within a hexagon, hover your mouse over the hexagon. The number of stations aggregated within that hexagon will be displayed as n stations. The average value for the selected sensor type will be also be shown, followed by the time range for which that value was measured. If there are not more than one station aggregated within a hexagon, the hover-over view will display the value for the selected parameter, followed by a list of the other sensor types associated with that station and the range of associated data. By default, only five of the sensors are shown in the hover window. More sensors are indicated by the n more sensors in the lower left of the window.
To view data for an individual station, zoom in on the map. The hexagons will soften into points that represent the individual stations that were aggregated into that hexagon. To view current readings from that station, hover over its point. As shown in the image below, a pop-up window will display some basic information about the station, including its name, data source affiliation(s), latitude and longitude, current readings, and available sensor parameters (e.g., air temperature, water level, and water temperature as in the example below).
To view station data, click on the point. As shown in the image below, data from the station will appear in the data display window in the lower left corner of the window. You can use the dropdown menu in the data display window to select data from different sensors, and you can use the Time Slider to adjust the time period of the data.
Historical Data¶
Historical data are data that are one month old or older. Historical data available through the portal were sometimes collected in real-time and subsequently archived; other historical data are ingested from local or national archives upon stakeholder request.
Customize Data in the Map¶
You can view and interact with the data in a number of ways. As with other interactive maps, you can pan and zoom to adjust the view to your area of interest. Additionally, you can click on a data point of interest to open a chart that summarizes the data. A time slider at the bottom of the map can be used to move back and forth through time for timeseries data. More information about these features is provided below.
Filter Data¶
In the map, your selected layers will appear in a legend on the right. The filters in the legend can be used to change the parameters on the map. You can select among the measurements that are available using the caret, or by toggling on/off the checkboxes. The exact filters or measurements available vary by the data layer being shown.
Toggle Layers On/Off¶
Individual data layers can be toggled on and off using the``Eyeball`` icon to the right of the data layer name. To delete the data layer from the map, select the X
icon.
Change Layer Order¶
The order in which data layers appear in the map can be changed. By default, the data layer that appears at the top of the map legend will be displayed forward in the map. To move data layers backward in the map, select the Up/Down Arrow
to the left of the data layer name.
Customize Color and Scale¶
The data layer legend on the right hand side shows the color scale that is used to represent the unit of measurement. You can change the palette and scale settings by clicking on the color bar. Select among the different color palettes using the drop down menu. The legend scale can be changed by either adjusting the scale slider, or by clicking on the gear icon and entering or advancing the bounds control interval. When the map is zoomed in, the scale and color for that area can be automatically set for the data in view by clicking the Autoset for data view
button.
For more details, please see the Customize Layers page.
Time Slider¶
The time slider bar at the bottom of the map allows you to view temporal data. The time intervals available will vary depending on which data layer you are viewing. The bar is unavailable if there is not any time-enabled data layers loaded. By default, the time slider is set to display the most recent data that is available for that data layer.
Tip
For quick reference, the time range for data being viewed in the map is shown in the right-hand map legend beneath the data layer title.
The temporal extent for the data layers can be viewed by hovering your mouse over the time slider control. The name ofthe data layer, the begin and end dates for the data, and a line graph of the temporal range will appear. The temporal information will appear for all time-enabled datasets that are currently loaded in the map. For more details, please see the How To Time Slider page.
Depth Filter¶
The depth slider bar located in the bottom right of the map allows you to filter data across the water column. The depth intervals available will vary depending on which data layer you are viewing. The bar is unavailable if there is not any depth-enabled data layers loaded. By default, the depth slider is set to display all data across the water column.
Tip
For quick reference, the depth range for data being viewed in the map is shown in the right-hand map legend beneath the time extent.
For more details, please see the Filter by Depth page.
For other ways to filter data in the map, please see the Filter Data page.
Data Charts¶
The catalog and map offer multiple ways of comparing data within both the mapped interface and within a Data Views.
Different Chart Types¶
This section includes descriptions for the common charts used to display data in the portal. Data charts can be accessed both by clicking a point on a data layer in the map, or by using the custom Data Views interface.
Categorical Variables¶
Bar charts: compare the size or frequency of different categories. Since the values of a categorical variable are labels for the categories, the distribution of a categorical variable gives either the count or the percent of individuals falling into each category.
Quantitative Variables¶
Line charts: display points connecting the data to show a continuous change over time. In the map, the line chart shows the current values together with historical statistics. The x-axis shows the occurrences and the categories being compared over time and the y-axis represents the scale, which is a set of numbers organized into equal intervals.
Histograms: show the frequency of distribution for the observations. A histogram is constructed by representing the measurements or observations that are grouped on a horizontal scale, the interval frequencies on a vertical scale, and drawing rectangles whose bases equal the class intervals and whose heights are determined by the corresponding class frequencies.
Box plots: are useful for identifying outliers and for comparing distributions. The boxplot is a graph of a five-number summary: the minimum score, first quartile (Q1-the median of the lower half of all scores), the median, third quartile (Q3-the median of the upper half of all scores), and the maximum score. The boxplot consists of a rectangular box, which represents the middle half of all scores (between Q1 and Q3). Approximately one-fourth of the values should fall between the minimum and Q1, and approximately one-fourth should fall between Q3 and the maximum. A line in the box marks the median. Lines called whiskers extend from the box out to the minimum and maximum scores.
Dot plots: consist of data points plotted on a fairly simple scale. Dot plots are suitable for small to moderate sized data sets to highlight clusters and gaps, as well as outliers. When dealing with larger data sets (around 20–30 or more data points) the box plot or histogram may be more efficient, as dot plots may become too cluttered after this point.
Curtain plots: show a visual summary of vertical profiling data. If data is available at depth, the chart will show depth on the y-axis with the values represented by colors.
For more details, please see the Customize Data Charts page.
Climatology and Anomaly Charts¶
If there are more than three years of data coverage, charts show statistics from past weather patterns along with the current data. These are not officially climatologies, which typically require 30 years of data, but they can still be useful to quickly compare how the current year compares to the long-term average.
Observational Statistics¶
By default, if there are too many observations to easily show on the time-series, the observations binned by default for display. Graphs may show the following:
Mean: The mean line represents the average value of all observations within each time bin.
Min/Max envelope: The envelope represents the extent of observations within each time bin.
Interannual Statistics¶
Interannual statistics are calculated on physical time-series where available data coverage in the system is longer than three years. Statistics are derived for days, weeks, months, seasons, and years based on the Gregorian calendar by:
binning the observations into the selected time periods,
combining the time bins across years (e.g, for daily bins, combining all data from April 13th regardless of year; for monthly bins, combine all data from all Aprils), and
calculating statistics for each interannual time bin.
For interannual statistics, we calculate the following:
Mean: The mean represents the average value of all observations within each time bin, across all recorded years.
Low: The low represents the minimum value of all observations within each time bin, across all recorded years.
High: The high represents the maximum value of all observations within each time bin, across years.
Mean to 10%, Mean to 90%: Percentiles are calculated by ordering all values in the time bin across all recorded years and selecting the value at the 10% and 90% locations in the array (i.e., the shaded percentile region relays what the typical temperature is at that time of year excluding the 10% most extreme values on either end of the distribution).
Anomaly Plots¶
Anomalies are available wherever interannual statistics are available (i.e., in all time-series where available data coverage in the system is longer than three years, but are only available on data binned on days or more).
Anomalies are calculated by calculating the mean value of the observational bin and subtracting the interannual statistical bin for that time period. For example, the daily anomaly for April 13th, 2016 is calculated by taking the average temperature for that day minus the mean interannual April 13th temperature.
Customize Data Charts¶
The table below contains a key to several of the important terms used in describing the Environmental Sensor Map’s chart capabilities:
Term |
Description |
---|---|
Minimum |
The minimum value of the entire time-series within each bin, represented by a dashed blue line. |
Mean to the 10th percentile |
The range from the mean to the 10th percentile of the data is represented by a blue shaded area. |
Mean |
The mean of the entire time-series within each bin, represented by a dashed gray line. |
Mean to the 90th percentile |
The range from the mean to the 90th percentile of the data is represented by a red shaded area. |
Maximum |
The maximum value of the entire time-series within each bin is represented by a dashed red line. |
Line chart |
A chart of the current values with historical statistics. |
Climatology |
Year-to-date monthly mean values of the current year compared to historical statistics. |
Anomaly |
The data values minus the mean values across all years. |
Curtain |
If data is available at depth, the chart will show depth on the y-axis with the values represented by colors. |
Time Bins¶
Data can be binned across years within the following time periods:
Time period |
Definition |
---|---|
All |
No binning. |
Hours |
Data are binned by hour and daily statistics are displayed (see below). |
Days |
Data are binned by day and statistics are by day number across years. |
Weeks |
Data are binned by week, and statistics are by week number across years. |
Months |
Data are binned by month, and statistics are by month number across years. |
Seasons |
Data are binned by northern hemisphere seasons defined as the following:
|
Years |
Data are binned by years, and statistics are across years. |
Note
Percentiles are calculated by ordering all values in the time bin across all recorded years and selecting the value at the 10% and 90% locations in the array. I.e., the shaded percentile region is telling you what the typical temperature is at that time of year excluding the 10% most extreme values on either end.
For more information on how to customize charts, refer to the Customize Data Charts section.
Quality Control (QARTOD)¶
Quality control algorithms are run on datasets and quality flag results are shown for visual exploration. The data quality procedures meet the U.S. Integrated Ocean Observing System (IOOS) Quality Assurance of Real Time Ocean Data (QARTOD) maintained through the IOOS QC library (ioos_qc). The automated QC algorithms do not screen out or delete any data, or prevent it from being downloaded. The algorithms only flag “suspect” or “fail” data points for visualization and deliver those flags as additional variables in downloaded data.
Roll up quality flags summarizing pass, suspect, and failed values can be seen under Inventory.
Data quality flags for individual data points can be seen within the data charts. Documentation of the test code and thresholds are linked to under QC information in the lower left of the chart.
Download Data¶
Data may be downloaded as described in the Download Data how-to page.
Data Views Overview¶
You can save a collection of data layers and visualize them together for comparison and analysis. These collections are called data views, and they are accessed by clicking on the views button in the toolbar along the top of the window.
Within the portal there are several pre-made data views that highlight environmental events or locations of interest. You can access these pre-made views from the portal landing page or by clicking on the views button and selecting a view from the dropdown menu
The view will open, displaying data comparison charts for you to explore, as seen in the example below.
Note
If you need assistance creating a particular view, please contact us via the feedback button in the top right corner of the upper toolbar.
For more details, please see the Data Views section of the Data Views How-To page.