Environmental Protection Agency
Enterprise Data Inventory - Volume and composition over time
M-13-13 Milestone 14 - February 28th 2017
OMB Review In Progress: OMB is currently reviewing the agency for this milestone. This review status indicator will change once the review is complete.
Leading Indicators
These indicators are reviewed by the Office of Management and Budget
Review Status | in-progress |
---|---|
Reviewer | Bryant Renaud |
Last Updated | May 3, 2017, 1:33 pm EDT by Bryant Renaud |
Assessment Summary
Other: Searching for EPA's EDI on data.gov returns two different EDIs. Is it possible to consolidate them? https://catalog.data.gov/organization/epa-gov?q=enterprise+data+inventory&sort=none
A significant number of download links are HTML.
EPA links that appear broken are, for the most part, the result of HTTP HEAD request errors with some ESRI-hosted datasets.
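The HEAD request issue noted above can be illustrated with a minimal sketch of a link check that retries with GET when HEAD fails. This is not the dashboard's crawler: the function names `fetch_status` and `check_link`, the "accessible"/"broken" labels, and the rule that any status of 400 or higher counts as a failure are all assumptions made for illustration.

```python
import urllib.error
import urllib.request


def fetch_status(url, method, timeout=10):
    """Return the HTTP status code for `url`, or 0 if no response arrived."""
    request = urllib.request.Request(url, method=method)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status (4xx or 5xx).
        return error.code
    except OSError:
        # DNS failure, refused connection, timeout, and similar problems.
        return 0


def is_failure(status):
    """Assumed rule: no response, or any 4xx/5xx status, is a failure."""
    return status == 0 or status >= 400


def check_link(url, fetch=fetch_status):
    """Try HEAD first; if that fails, retry with GET before giving up."""
    status = fetch(url, "HEAD")
    if is_failure(status):
        # Some servers reject HEAD even though GET works.
        status = fetch(url, "GET")
    return "broken" if is_failure(status) else "accessible"
```

A link that fails HEAD but succeeds on GET is reported as accessible here, which is the case the reviewer's note describes. The `fetch` parameter exists only so the logic can be exercised without network access.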
Inventory Composition
Public Dataset Status
Dataset Link Quality
| Status | Indicator | Automated Metrics |
|---|---|---|
| | Overall Progress this Milestone | |
| | Inventory Updated this Quarter | |
| 2310 | Number of Datasets | |
| 1655 | Number of APIs | |
| 1 | Bureaus represented | |
| 100.0% | Percentage of bureaus represented | |
| 19 | Programs represented | |
| 15.6% | Percentage of programs represented | |
| 2251 | Number of public datasets | |
| 4 | Number of restricted public datasets | |
| 55 | Number of non-public datasets | |
| 5.7% | Percentage growth in records since last quarter | |
| To a great extent (50-75%) | To what extent is your agency’s Enterprise Data Inventory (EDI) complete? | |
| | What steps have you taken to ensure your Enterprise Data Inventory is complete | |
| | Agency provides a public Enterprise Data Inventory on Data.gov | |
| | Agency provided updated Enterprise Data Inventory to OMB | |
| 100% | License specified | Crawl details |
| | Number of datasets with redactions | |
| | Percent of datasets with redactions | |
| Status | Indicator | Automated Metrics |
|---|---|---|
| | Overall Progress this Milestone | |
| 2310 | Number of Datasets | Crawl details |
| 24 | Number of Collections | Crawl details |
| 2080 | Number of datasets not contained in a collection | Crawl details |
| 2010 | Number of Public Datasets with File Downloads | Crawl details |
| 1655 | Number of APIs | Crawl details |
| 1604 | Number of public APIs | Crawl details |
| | Number of restricted public APIs | Crawl details |
| 51 | Number of non-public APIs | Crawl details |
| 3773 | Total number of access and download links | Crawl details |
| | Quality Check: Links are sufficiently working | Crawl details |
| 996 | Quality Check: Accessible links | Crawl details |
| 1615 | Quality Check: Redirected links | Crawl details |
| 16 | Quality Check: Error links | Crawl details |
| 960 | Quality Check: Broken links | Crawl details |
| 3.2% | Quality Check: Percentage of download links in correct format as specified in metadata | Crawl details |
| 28.7% | Quality Check: Percentage of download links in HTML | Crawl details |
| 2.1% | Quality Check: Percentage of download links in PDF | Crawl details |
| 5.7% | Percentage growth in records since last quarter | |
| 100% | Valid Metadata | Crawl details |
| | /data exists | Crawl details |
| | Provides datasets in human-readable form on /data | |
| | /data.json | Crawl details |
| | Harvested by data.gov | |
| 2251 | Number of public datasets | Crawl details |
| 4 | Number of restricted public datasets | Crawl details |
| 55 | Number of non-public datasets | Crawl details |
| 5.6% | Percent growth of public datasets | |
| 0.0% | Percent growth of restricted public datasets | |
| 10.0% | Percent growth of non-public datasets | |
| | Percent datasets licensed as U.S. Public Domain | |
| | Percent datasets licensed as Creative Commons Zero | |
| | Percent datasets with other licenses | |
| | Percent datasets with no license | |
| Status | Indicator | Automated Metrics |
|---|---|---|
| | Overall Progress this Milestone | |
| | Description of feedback mechanism delivered | Crawl details |
| | Data release is prioritized through public engagement | |
| | Provided narrative evidence of data improvements based on public feedback this quarter | |
| | Feedback loop is closed, 2 way communication | |
| See below | Link to or description of Feedback Mechanism | |
| | https://developer.epa.gov/forums/forum/dataset-qa/ | |
| | Provides valid contact point information for all datasets | |
| Status | Indicator | Automated Metrics |
|---|---|---|
| | Overall Progress this Milestone | |
| | Data Publication Process Delivered | Crawl details |
| | Information that should not be made public is documented with agency's OGC | |
| See below | Describe the agency's data publication process | |
| | EPA has a number of policies and procedures concerning the publication of Agency data. The Enterprise Information Management Policy requires all EPA Organization officials, employees, and individuals or non-EPA organizations, if applicable, to ensure information is cataloged and/or labeled with metadata. This includes geographic references, as appropriate, in EPA and Federal-wide registries, repositories or other information systems. The EPA GeoPlatform Publishing Workflow Standard Operating Procedure and the EPA Environmental Dataset Gateway (EDG) Governance Structure and Standard Operating Procedure outline the details of EPA data publishing. | |
Automated Metrics
These metrics are generated by an automated analysis that runs every 24 hours until the end of the quarter, at which point they become a historical snapshot.
data.json
Expected Data.json URL | http://www.epa.gov/data.json (From USA.gov Directory) |
---|---|
Resolved Data.json URL | https://edg.epa.gov/data.json |
Number of Redirects | 3 redirects |
HTTP Status | 200 |
Content Type | application/json |
Valid JSON | Valid |
Datasets with Valid Metadata | 100% (2310 of 2310) |
Valid Schema | Valid |
Datasets | 2310 |
Number of Collections | 24 |
Number of datasets not in a collection | 2080 |
Datasets with Distribution URLs | 87.0% (2010 of 2310) |
Datasets with Download URLs | 79.6% (1839 of 2310) |
Total Distribution URLs | 3773 |
Total Download URLs | 2118 |
Total APIs | 1655 |
Public APIs | 1604 |
Restricted Public APIs | 0 |
Non-public APIs | 51 |
Public Datasets | 2251 |
Restricted Public Datasets | 4 |
Non-public Datasets | 55 |
Normally there would be a set of quality assurance fields here to verify that the download links included within the metadata are functioning properly, but the results of those tests are not currently available. | |
Bureaus Represented | 1 |
Programs Represented | 19 |
License Specified | 100% (2310 of 2310) |
Datasets with Redactions | 0.0% (0 of 2310) |
Redactions without explanation (rights field) | 0.0% (0 of 2310) |
File Size | 6.88MB |
Last modified | Monday, 13-Feb-2017 13:47:38 EST |
Last crawl | Tuesday, 28-Feb-2017 23:00:11 EST |
Analyze archive copies | Analyze archive from 2017-02-28 |
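Counts like those in the data.json table above could be derived from a catalog along the following lines. This is a rough sketch, not the dashboard's actual analysis. It assumes the catalog follows the Project Open Data metadata schema (a top-level `dataset` array whose entries carry `accessLevel`, `distribution`, and `isPartOf`), that every `downloadURL` counts as a download link, and that every `accessURL` counts as an API or access link. That last assumption is at least consistent with the reported totals, since 2118 download URLs plus 1655 APIs equals the 3773 distribution URLs. The function name `summarize_catalog` and the keys of its result are hypothetical.

```python
from collections import Counter


def summarize_catalog(catalog):
    """Tally basic inventory metrics from a parsed data.json catalog."""
    datasets = catalog.get("dataset", [])

    # accessLevel is expected to be "public", "restricted public",
    # or "non-public" under the assumed schema.
    by_access_level = Counter(d.get("accessLevel") for d in datasets)

    with_distribution = 0
    download_urls = 0
    access_urls = 0
    for dataset in datasets:
        distributions = dataset.get("distribution") or []
        if distributions:
            with_distribution += 1
        for distribution in distributions:
            if "downloadURL" in distribution:
                download_urls += 1
            if "accessURL" in distribution:
                access_urls += 1

    # Assumption: a dataset belongs to a collection when it has isPartOf.
    not_in_collection = sum(1 for d in datasets if "isPartOf" not in d)

    return {
        "datasets": len(datasets),
        "by_access_level": dict(by_access_level),
        "with_distribution": with_distribution,
        "download_urls": download_urls,
        "access_urls": access_urls,
        "not_in_collection": not_in_collection,
    }
```

To try it, load a downloaded copy of the catalog with `json.load` and pass the resulting dictionary to `summarize_catalog`. The numbers it produces may differ from the dashboard's wherever the assumptions above do not match the dashboard's own rules.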